INTRODUCTION


Set 

A set is a collection of well-defined objects, which are called the elements of the set.
Georg Cantor (1845-1918) developed set theory. Sets are used throughout the theory of
computation.


Some examples of sets are given below: 

• The set of vowels in English alphabet 

• The set of natural numbers less than 12. 

• The set of odd numbers less than 20. 


There are three methods to describe a set: 

a. Enumeration method-listing 

b. Standard method or description method 

c. Set builder method 

a. Enumeration method: When the elements of a set are enumerated or listed, we enclose
them in braces.

e.g. A = {1, 2, 3, ..., 100}

b. Standard method: Frequently used sets are usually given symbols that are reserved for them
only.

e.g. N (Natural Numbers) = {1, 2, 3, ..., n, ...}

c. Set builder method: Another way of representing a set is to use the set builder method,
which states the property that the elements satisfy, e.g. we could define the rational numbers as

Q = {x/y : x, y ∈ Z; y ≠ 0}

Types of sets 

a. Empty set: A set which has no elements is called the empty set, null set or void set. It is
denoted by ∅.

b. Singleton set: A set which has a single element is called a singleton set.

c. Disjoint sets: Two or more sets are said to be disjoint if they have no common elements.

d. Overlapping sets: Two or more sets are said to be overlapping if they have at least one
common element.

e. Finite sets: A set having a specified (finite) number of elements is called a finite set.

f. Infinite sets: A set is called an infinite set if it is not a finite set.

g. Universal set: The set of all objects or things under consideration in a discussion is called the
universal set.


Relation 

A relation is a correspondence between two sets (called the domain and the range) such that to 
each element of the domain, there is assigned one or more elements of the range. 




TPL(anithub. technovative@gmail.com) 


Page 3 












TOC preparation kit 2012 


State the domain and range of the following relation. Is the relation a function? {(2, -3), (4, 6), (3,
-1), (6, 6), (2, 3)}

The above list of points, being a relationship between certain x's and certain y's, is a relation. The
domain is all the x-values, and the range is all the y-values. To give the domain and the range, I
just list the values without duplication: domain: {2, 3, 4, 6} range: {-3, -1, 3, 6}

The relation is not a function, because x = 2 is paired with two different y-values (-3 and 3).
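
The same answer can be checked mechanically. The following is a small Python sketch, not part of the original notes; the helper names `domain`, `range_of` and `is_function` are chosen here for illustration only.

```python
# The relation from the example, stored as a set of (x, y) pairs.
relation = {(2, -3), (4, 6), (3, -1), (6, 6), (2, 3)}

def domain(rel):
    """All first components, listed without duplication."""
    return sorted({x for x, _ in rel})

def range_of(rel):
    """All second components, listed without duplication."""
    return sorted({y for _, y in rel})

def is_function(rel):
    """A relation is a function if every x is paired with exactly one y."""
    images = {}
    for x, y in rel:
        images.setdefault(x, set()).add(y)
    return all(len(ys) == 1 for ys in images.values())

print(domain(relation))       # [2, 3, 4, 6]
print(range_of(relation))     # [-3, -1, 3, 6]
print(is_function(relation))  # False, since 2 maps to both -3 and 3
```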

Types of relation 

Identity relation: In an identity relation "R", every element of the set “A” is related to itself only. 
Note the conditions conveyed through words “every” and “only”. The word “every” conveys that 
identity relation consists of ordered pairs of element with itself - all of them. The word “only” 
conveys that this relation does not consist of any other combination. 

Consider a set 

A={1,2,3} Then, its identity relation is: R={(1,1), (2,2),(3,3)} 

Reflexive relation 

In reflexive relation, "R", every element of the set “A” is related to itself. The definition of 
reflexive relation is exactly same as that of identity relation except that it misses the word “only” 
in the end of the sentence. The implication is that this relation includes identity relation and 
permits other combination of paired elements as well. 

Consider a set 

A={1,2,3} Then, one of the possible reflexive relations can be: 

R={(1,1), (2,2), (3,3), (1,2), (1,3)}

However, the following is not a reflexive relation, because (3,3) is missing: R1={(1,1), (2,2), (1,2), (1,3)}

Symmetric relation 

In a symmetric relation, every instance of the relation has a mirror image. It means that if (1,3) is an
instance, then (3,1) is also an instance in the relation. Clearly, an ordered pair of an element with itself,
like (1,1) or (2,2), is its own mirror image. Consider some examples of
symmetric relations:

R1={(1,2),(2,1),(1,3),(3,1)}

R2={(1,2),(1,3),(2,1),(3,1),(3,3)}

We have purposely jumbled up the ordered pairs to emphasize that the order of elements in a relation is not
important. In order to decide the symmetry of a relation, we need to identify mirror pairs. We state the
condition of a symmetric relation as: R is symmetric iff (x,y)∈R ⇒ (y,x)∈R for all x,y∈A

The symbol “iff” means “if and only if”. Here the one-directional arrow means “implies”.
Alternatively, the condition of a symmetric relation can be stated as: xRy ⇒ yRx for all x,y∈A

Transitive relation 

If “R” is a relation on set A, then we state the condition of a transitive relation as: R is transitive iff (x,y)∈R and
(y,z)∈R ⇒ (x,z)∈R for all x,y,z∈A
Alternatively, xRy and yRz ⇒ xRz for all x,y,z∈A

Equivalence relation

A relation is an equivalence relation if it is reflexive, symmetric and transitive at the same time. In
order to check whether a relation is an equivalence relation, we need to check all three properties.
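
The three conditions translate directly into code. Below is a minimal Python sketch, assuming a relation is stored as a set of ordered pairs over a finite set A; the function names are illustrative and not from the original notes.

```python
def is_reflexive(rel, a):
    """Every element of a is related to itself."""
    return all((x, x) in rel for x in a)

def is_symmetric(rel):
    """Every pair (x, y) has its mirror image (y, x)."""
    return all((y, x) in rel for x, y in rel)

def is_transitive(rel):
    """(x, y) and (y, z) in rel imply (x, z) in rel."""
    return all((x, z) in rel
               for x, y in rel
               for w, z in rel
               if y == w)

def is_equivalence(rel, a):
    return is_reflexive(rel, a) and is_symmetric(rel) and is_transitive(rel)

A = {1, 2, 3}
identity = {(1, 1), (2, 2), (3, 3)}
reflexive_example = {(1, 1), (2, 2), (3, 3), (1, 2), (1, 3)}
not_reflexive = {(1, 1), (2, 2), (1, 2), (1, 3)}
symmetric_example = {(1, 2), (2, 1), (1, 3), (3, 1)}

print(is_equivalence(identity, A))         # True
print(is_reflexive(reflexive_example, A))  # True
print(is_reflexive(not_reflexive, A))      # False, (3, 3) is missing
print(is_symmetric(symmetric_example))     # True
```

The identity relation passes all three checks, so it is the smallest equivalence relation on A.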


Function 

A function is a correspondence between two sets (called the domain and the range) such that to 
each element of the domain, there is assigned exactly one element of the range. 

Determine the domain and range of the given function:

y = -√(-2x + 3)

The domain is all values that x can take on. The only problem I have with
this function is that I cannot have a negative inside the square root. So I'll set the insides greater-
than-or-equal-to zero, and solve. The result will be my domain:

-2x + 3 ≥ 0  ⇒  -2x ≥ -3  ⇒  2x ≤ 3  ⇒  x ≤ 3/2 = 1.5
Then the domain is "all x ≤ 3/2".

The square root is never negative, so its negation is never positive. Then the range is "all y ≤ 0".

Alphabets 

The symbols are generally letters and digits. An alphabet is defined as a finite set of symbols. It is
denoted by the symbol ‘Σ’.

E.g.: The alphabet of decimal digits is given by Σ = {0, 1, ..., 9}.

The alphabet for binary numbers is Σ = {0, 1}.

Strings

A string or word is a finite sequence of symbols selected from some alphabet. E.g. if Σ = {a, b}
then ‘abab’ is a string over Σ. A string is generally denoted by ‘w’. The empty string is the string
with 0 (zero) occurrences of symbols. This string is represented by ε, e or λ.

Closure of an alphabet 

Closure of an alphabet is defined as the set of all strings over an alphabet Σ including the empty
string and is denoted by Σ*.

e.g.

Let Σ = {0, 1} then

Σ* = {ε, 0, 1, 00, 10, 01, 11, ...}

Σ^1 = {0, 1}

Σ^2 = {00, 01, 10, 11}

Σ^3 = {000, 001, 010, 011, 100, 101, 110, 111}

Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...

Therefore, Σ* = Σ+ ∪ {ε}
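
Since the closure is an infinite set, a program can only list it up to a chosen length. The Python sketch below is an illustration added here (the names `power` and `closure_up_to` are assumptions, not standard terms); it builds the powers of the alphabet with `itertools.product` and represents the empty string as `""`.

```python
from itertools import product

def power(sigma, k):
    """All strings of length exactly k over the alphabet sigma."""
    return ["".join(p) for p in product(sigma, repeat=k)]

def closure_up_to(sigma, n):
    """The strings of the closure that have length at most n."""
    out = []
    for k in range(n + 1):
        out.extend(power(sigma, k))
    return out

sigma = ["0", "1"]
print(power(sigma, 2))          # ['00', '01', '10', '11']
print(len(power(sigma, 3)))     # 8
print(closure_up_to(sigma, 2))  # ['', '0', '1', '00', '01', '10', '11']
```

Starting the loop at k = 1 instead of k = 0 would give the positive closure, which omits the empty string.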

Concatenating of string 

Let w1 and w2 be two strings, then w1w2 denotes the concatenation of w1 and w2.

e.g.

if w1 = abc, w2 = xyz, then

w1w2 = abcxyz

Languages

A set of strings, all of which are chosen from Σ* where Σ is a particular alphabet, is called a
language.

Let Σ = {0, 1} then,

L = {all strings over Σ with equal number of 0’s and 1’s}

= {ε, 01, 10, 0011, ...}

L ⊆ Σ*

Exercises 

1. What is a set? What are its types?

2. How can we represent a set?

3. What is a relation? Give examples.

4. Explain the types of relations.

5. What do you mean by a function?

6. Define the following terms with examples:

a. String

b. Alphabet

c. Symbol

d. Kleene closure

e. Function 

f. Union 

g. Concatenation 

Methods of proof 

Theorem 

A theorem is a mathematical proposition that is true. Many theorems are conditional propositions.
For example, if f(x) and g(x) are continuous then f(x) ± g(x) are also continuous.

If a theorem is of the form “if p then q”, then p is called the hypothesis and q is called the conclusion.

Proof 

A proof of a theorem is a logical argument that establishes the theorem to be true. There are 
different types of proofs of a theorem. Some of them are given below: 

• Trivial proof 

• Vacuous proof 

• Direct proof 

• Indirect proof 

• Proof by contradiction 

• Proof by cases 

• Proof by mathematical induction 

• Proof by counter examples 

Trivial proof 






We say p → q is trivially true if q is true, and this kind of proof (i.e. showing q is true without
referring to p) is called a trivial proof.

Consider an implication: p → q

If it can be shown that q is true, then the implication is always true by definition of an implication.

Vacuous proofs

Consider an implication: p → q

If it can be shown that p is false, then the implication is always true by definition of an implication.
Note that you are showing that the antecedent is false.


Direct Proofs 

To prove p → q, we start by assuming the hypothesis p is true and use information already available to
prove q is true; if q is true then the argument is valid. This is called a direct proof.

E.g. If a and b are odd integers, then a+b is an even integer.

Here a and b are odd integers. Every odd number can be written as 2l+1 where l is some
integer.

So, a = 2m+1

b = 2n+1 for some integers m and n

Now, a+b = 2m+1+2n+1 = 2m+2n+2 = 2(m+n+1) = 2k where k = m+n+1 is an integer. This
shows a+b is even.

Indirect proof (proof by contraposition)

p → q is equivalent to ¬q → ¬p. So to prove p → q, we assume the conclusion q is false and
show that the hypothesis p is then false; this proves ¬q → ¬p, and hence the original implication.


e.g. if the product of two integers a and b is even, then either a is even or b is even.

Suppose the conclusion is false, i.e. both a and b are odd integers. So, a = 2m+1 and b = 2n+1.

Then a×b = (2m+1)(2n+1) = 4mn+2m+2n+1 = 2(2mn+m+n)+1 = 2l+1 where l = 2mn+m+n, so the product
is odd and the hypothesis is false. So, our original implication is true.


Proof by contradiction 

The following proof proceeds by contradiction. That is, we will assume that the claim we are trying 
to prove is wrong and reach a contradiction. If all the derivations along the way are correct, then the 
only thing that can be wrong is the assumption, which was that the claim we are trying to prove 
does not hold. This proves that the claim does hold. 


Eg: For all integers n, if n^2 is odd, then n is odd.

Suppose not. [We take the negation of the given statement and suppose it to be true.] Assume, to
the contrary, that ∃ an integer n such that n^2 is odd and n is even. [We must deduce the
contradiction.] By definition of even, we have

n = 2k for some integer k.

So, by substitution we have

n . n = (2k) . (2k)
= 2 (2.k.k)

Now (2.k.k) is an integer because products of integers are integers, and 2 and k are integers. Hence,

n . n = 2 . (some integer)
or n^2 = 2 . (some integer)

and so, by definition of even, n^2 is even.

So the conclusion is: since n is even, n^2, which is the product of n with itself, is also even. This
contradicts the supposition that n^2 is odd. [Hence, the supposition is false and the proposition is
true.]

Proof by cases 

You can sometimes prove a statement by: 

1. Dividing the situation into cases which exhaust all the possibilities; and 

2. Showing that the statement follows in all cases. 

It's important to cover all the possibilities. And don't confuse this with trying examples; an example 
is not a proof. 


Theorem. If n is a positive integer then n^7 - n is divisible by 7.

Proof:

First we factor n^7 - n = n(n^6 - 1) = n(n^3 - 1)(n^3 + 1) = n(n-1)(n^2 + n + 1)(n+1)(n^2 - n + 1). Now
there are 7 cases to consider, depending on n = 7q + r where r = 0, 1, 2, 3, 4, 5, 6.

Case 1: n = 7q. Then n^7 - n has the factor n, which is divisible by 7.

Case 2: n = 7q + 1. Then n^7 - n has the factor n-1 = 7q.

Case 3: n = 7q + 2. Then the factor n^2 + n + 1 = (7q + 2)^2 + (7q+2) + 1 = 49q^2 + 35q + 7
is clearly divisible by 7.

Case 4: n = 7q + 3. Then the factor n^2 - n + 1 = (7q + 3)^2 - (7q+3) + 1 = 49q^2 + 35q + 7 is
clearly divisible by 7.

Case 5: n = 7q + 4. Then the factor n^2 + n + 1 = (7q + 4)^2 + (7q+4) + 1 = 49q^2 + 63q + 21
is clearly divisible by 7.

Case 6: n = 7q + 5. Then the factor n^2 - n + 1 = (7q + 5)^2 - (7q+5) + 1 = 49q^2 + 63q + 21
is clearly divisible by 7.

Case 7: n = 7q + 6. Then the factor n + 1 = 7q + 7 is clearly divisible by 7.
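
The case analysis can be cross-checked with a short Python sketch (added here for illustration; the dictionary `factors` and the function `vanishing_factor` are names assumed for this example). Each factor is a polynomial with integer coefficients, so whether it is divisible by 7 depends only on the remainder r, and testing r = 0, ..., 6 covers exactly the seven cases.

```python
# The five factors of n^7 - n used in the proof.
factors = {
    "n": lambda n: n,
    "n - 1": lambda n: n - 1,
    "n + 1": lambda n: n + 1,
    "n^2 + n + 1": lambda n: n * n + n + 1,
    "n^2 - n + 1": lambda n: n * n - n + 1,
}

def vanishing_factor(r):
    """Name of the first factor divisible by 7 when n = 7q + r, or None."""
    for name, f in factors.items():
        if f(r) % 7 == 0:
            return name
    return None

for r in range(7):
    print(f"r = {r}: factor {vanishing_factor(r)} is divisible by 7")
```

Every remainder yields some factor, which matches the factor named in the corresponding case of the proof.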

Proof by mathematical induction

Mathematical induction is a powerful, yet straightforward method of proving statements whose 
"domain" is a subset of the set of integers. Usually, a statement that is proven by induction is based 
on the set of natural numbers. This statement can often be thought of as a function of a number n, 
where n = 1,2,3... 


Proof by induction involves three main steps: proving the base of induction, forming the induction
hypothesis, and finally proving the inductive step, i.e. that whenever the claim holds for n = k it also
holds for n = k+1. Together these show that the claim holds for all numbers in the
domain.

Proving the base of induction involves showing that the claim holds true for some base value
(usually 0, 1, or 2). There are sometimes many ways to do this, and it can require some ingenuity.
We will outline this with a simple example.

Theorem. A formula for the sequence a_n (given by a_0 = 1/4 and the recursive definition
a_(k+1) = 2 a_k (1 - a_k) used below) is a_n = (1 - 1/2^(2^n))/2 for all n greater than
or equal to 0.

Proof. (By Mathematical Induction.)

Initial Step. When n = 0, the formula gives us (1 - 1/2^(2^0))/2 = (1 - 1/2)/2 = 1/4 = a_0. So the closed
form formula gives us the correct answer when n = 0.

Inductive Step. Our inductive assumption is: Assume there is a k, greater than or equal to zero,
such that a_k = (1 - 1/2^(2^k))/2. We must prove the formula is true for n = k+1.

First we appeal to the recursive definition a_(k+1) = 2 a_k (1 - a_k). Next, we invoke the inductive
assumption, for this k, to get

a_(k+1) = 2 (1 - 1/2^(2^k))/2 (1 - (1 - 1/2^(2^k))/2) = (1 - 1/2^(2^k))(1 + 1/2^(2^k))/2 = (1 - 1/2^(2^(k+1)))/2. This
completes the inductive step.
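
A numeric check is not a proof, but it is a useful sanity test of a closed form before attempting the induction. The Python sketch below assumes the sequence starts at a_0 = 1/4 and follows the recursive rule quoted in the proof; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def a_recursive(n):
    """a_0 = 1/4 and a_(k+1) = 2 * a_k * (1 - a_k), applied n times."""
    a = Fraction(1, 4)
    for _ in range(n):
        a = 2 * a * (1 - a)
    return a

def a_closed(n):
    """Closed form (1 - 1/2^(2^n)) / 2 from the theorem."""
    return (1 - Fraction(1, 2 ** (2 ** n))) / 2

for n in range(6):
    print(n, a_recursive(n), a_recursive(n) == a_closed(n))
```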

Proof by Counterexample 

Consider a statement of the form
∀x ∈ M, if P(x) then Q(x).

Suppose that we wish to prove that this statement is false. In order to disprove this statement, we
have to find a value of x in M for which P(x) is true and Q(x) is false. Such an x is called a
counterexample.

Furthermore, proving that this statement is false is equivalent to showing that its negation is true.
The negation of the above statement is
∃x in M such that P(x) and not Q(x).

∃x ∈ M | P(x) ∧ ~Q(x).

Finding an x that makes the above statement true will disprove the original statement. 


Automata Theory

→ The study of the mathematical properties of abstract machines, or automata, is automata theory.

→ In theoretical computer science, automata theory is the study of abstract machines (or more
appropriately, abstract 'mathematical' machines or systems, as they are described in mathematical
terms) and the computational problems that can be solved using these machines. These abstract
machines are called automata. The word automata comes from a Greek word which means "self-acting".

→ A finite state machine belongs to one well-known variety of
automata. Such an automaton consists of states (represented in diagrams by circles) and transitions
(represented by arrows). As the automaton sees a symbol of input, it makes a transition (or jump)
to another state, according to its transition function (which takes the current state and the most recent
symbol as its inputs).

→ Automata theory is also closely related to formal language theory. An automaton is a finite
representation of a formal language that may be an infinite set. Automata are often classified by
the class of formal languages they are able to recognize.

→ Automata play a major role in theory of computation, compiler design, parsing and formal
verification.

Finite-state machine 

A finite-state machine (FSM) or finite-state automaton (plural: automata), or simply a state
machine, is a mathematical model used to design computer programs and digital logic circuits. It is
conceived as an abstract machine that can be in one of a finite number of states. The machine is in
only one state at a time; the state it is in at any given time is called the current state. It can change
from one state to another when initiated by a triggering event or condition; this is called a
transition. A particular FSM is defined by a list of the possible transition states from each current
state, and the triggering condition for each transition. Finite-state machines can model a large
number of problems, among which are electronic design automation, communication protocol
design, parsing and other engineering applications. In biology and artificial intelligence research,
state machines or hierarchies of state machines are sometimes used to describe neurological
systems, and in linguistics to describe the grammars of natural languages.

Finite Automata


An automaton with a finite set of states, whose “control” moves from state to state in response to
external “inputs”, is called a finite automaton.

A finite automaton, FA, provides the simplest model of a computing device. It has a central
processor of finite capacity and it is based on the concept of state. It can also be given a formal
mathematical definition. Finite automata are used for pattern matching in text editors and for
lexical analysis in compilers.

Another useful notion is that of a nondeterministic finite automaton (NDFA).


We can prove that deterministic finite automata, DFA, recognize the same class of languages as
NDFA, i.e. they are equivalent formalisms.

It is also possible to prove that given a regular language L there exists a unique (up to isomorphism)
minimum finite state automaton that accepts it, i.e. an automaton with a minimum set of states.

The automata in the examples are deterministic, that is, once their state and input are given, their
evolution is uniquely determined.

Formal definition 


An automaton is represented formally by a 5-tuple (Q, Σ, δ, q0, F), where:

• Q is a finite set of states.

• Σ is a finite set of symbols, called the alphabet of the automaton.

• δ is the transition function, that is, δ: Q × Σ → Q.

• q0 is the start state, that is, the state of the automaton before any input has been
processed, where q0 ∈ Q.

• F is a set of states of Q (i.e. F ⊆ Q) called accept states.


Applications 

Each model in automata theory plays an important role in several applied areas.

• Finite automata are used in text processing, compilers, and hardware design.

• Context-free grammars (CFGs) are used in programming languages and artificial intelligence. Originally,
CFGs were used in the study of human languages.

• Cellular automata are used in the field of biology, the most common example being John Conway's Game of
Life.

• Some other examples in biology which could be explained using automata theory include mollusk and pine
cone growth and pigmentation patterns.

• Going further, a theory suggesting that the whole universe is computed by some sort of discrete automaton
is advocated by some scientists.


Deterministic finite automata (DFA)

In automata theory, a branch of theoretical computer science, a deterministic finite automaton (DFA), also known
as a deterministic finite state machine,
is a finite state machine that accepts/rejects finite strings of symbols and produces a unique computation (or
run) of the automaton for each input string.

[Figure: state diagram of a deterministic finite automaton with states S0, S1 and S2]

The figure above illustrates a deterministic finite automaton. In the automaton, there are three states: S0, S1, and S2
(denoted graphically by circles). The automaton takes a finite sequence of 0s and 1s as input. For each state, there is a
transition arrow leading out to a next state for both 0 and 1. Upon reading a symbol, a DFA jumps deterministically
from one state to another by following the transition arrow. For example, if the automaton is currently in state S0 and
the current input symbol is 1 then it deterministically jumps to state S1. A DFA has a start state (denoted graphically by
an arrow coming in from nowhere) where computations begin, and a set of accept states (denoted graphically by a
double circle) which help define when a computation is successful.


A DFA is defined as an abstract mathematical concept, but due to the deterministic nature of a DFA, it is 
implementable in hardware and software for solving various specific problems. 

DFAs can be built from nondeterministic finite automata through the power set construction . 


Formal definition 

A deterministic finite automaton M is a 5-tuple (Q, Σ, δ, q0, F), consisting of

• a finite set of states (Q)

• a finite set of input symbols called the alphabet (Σ)

• a transition function (δ : Q × Σ → Q)

• a start state (q0 ∈ Q)

• a set of accept states (F ⊆ Q)
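
The 5-tuple maps naturally onto a small program. The following Python sketch is an illustration only: the function `dfa_accepts` and the example machine (a two-state DFA for binary strings with an even number of 0s, with state names "even" and "odd") are assumptions made here, not taken from the figures of the original notes.

```python
def dfa_accepts(delta, start, accepting, word):
    """Run the DFA on word; delta maps (state, symbol) to the next state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]  # exactly one next state: deterministic
    return state in accepting

# Assumed example: Q = {"even", "odd"}, alphabet {0, 1}, start state "even",
# accept states {"even"}. The state records the parity of the 0s read so far.
delta = {
    ("even", "0"): "odd",
    ("even", "1"): "even",
    ("odd", "0"): "even",
    ("odd", "1"): "odd",
}

print(dfa_accepts(delta, "even", {"even"}, "1001"))  # True  (two 0s)
print(dfa_accepts(delta, "even", {"even"}, "10"))    # False (one 0)
print(dfa_accepts(delta, "even", {"even"}, ""))      # True  (zero 0s)
```

Because the transition function is total and single-valued, each input string produces exactly one run, which is what "deterministic" means in the definition above.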

Examples & Exercises of DFA 

1. Construct a DFA to accept a string containing a zero followed by a one.

2. Construct a DFA to accept a string containing two consecutive zeroes followed by two consecutive
ones.

3. Construct a DFA to accept a string containing an even number of zeroes and any number of ones.

4. Construct a DFA to accept all strings which do not contain three consecutive zeroes.

5. Construct a DFA to accept all strings containing an even number of zeroes and an even number of ones.

6. Construct a DFA to accept all strings in (0+1)* with an equal number of 0's and 1's such that
each prefix has at most one more zero than ones and at most one more one than zeroes.


Nondeterministic finite automata (NFA)

In automata theory, a nondeterministic finite automaton (NFA) or nondeterministic finite state machine is a finite
state machine where from each state and a given input symbol the automaton may jump into several possible next
states. This distinguishes it from the deterministic finite automaton (DFA), where the next possible state is uniquely
determined. Although the DFA and NFA have distinct definitions, an NFA can be translated to an equivalent DFA using
the power set construction, i.e., the constructed DFA and the NFA recognize the same formal language. Both types of
automata recognize only regular languages.

Non-deterministic finite state machines are sometimes studied under the name subshifts of finite type. Non-deterministic
finite state machines are generalized by probabilistic automata, which assign a probability to each state transition.

Formal definition 

An NFA is represented formally by a 5-tuple (Q, Σ, Δ, q0, F), consisting of

• a finite set of states Q

• a finite set of input symbols Σ

• a transition relation Δ : Q × Σ → P(Q)

• an initial (or start) state q0 ∈ Q

• a set of states F distinguished as accepting (or final) states, F ⊆ Q
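
An NFA can be simulated by tracking the set of all states it could be in after each symbol. The Python sketch below is illustrative: the function `nfa_accepts` and the three-state example machine (one possible NFA for binary strings ending in 01, with assumed state names q0, q1, q2) are not taken from the original figures.

```python
def nfa_accepts(delta, start, accepting, word):
    """delta maps (state, symbol) to a set of next states (missing key = no move)."""
    current = {start}
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), set())}
    # The word is accepted if at least one possible run ends in an accept state.
    return bool(current & accepting)

# Assumed example NFA: q0 loops on every symbol and "guesses" when the
# final 01 begins; q2 is the only accept state.
delta = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

print(nfa_accepts(delta, "q0", {"q2"}, "1101"))  # True
print(nfa_accepts(delta, "q0", {"q2"}, "0110"))  # False
```

The transition table returns sets of states, mirroring the signature Q × Σ → P(Q) in the definition.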

Examples and Exercises of NFA 

1. Construct an NFA to accept all strings terminating in 01.


































2. Construct an NFA to accept those strings containing three consecutive zeroes.



NFA with epsilon moves (ε-NFA)

→ The NFA-ε (also sometimes called NFA-λ or NFA with epsilon moves) replaces the transition
function with one that allows the empty string ε as a possible input, so that one has instead

Δ : Q × (Σ ∪ {ε}) → P(Q).


→ It can be shown that ordinary NFA and NFA-ε are equivalent, in that, given either one, one can
construct the other, which recognizes the same language.


→ We can extend an NFA by introducing a "feature" that allows us to make a transition on
ε, the empty string. All the ε transition lets us do is spontaneously make a transition,
without receiving an input symbol. This is another mechanism that allows our NFA to be
in multiple states at once. Whenever we take an ε edge, we must fork off a new "thread"
for the NFA starting in the destination state.


→ Just as non-determinism made NFAs more convenient for representing some problems than
DFAs without making them more powerful, the same applies to ε-NFAs. While more convenient, anything we
can represent with an ε-NFA we can represent with a DFA that has no ε transitions.

→ The ε (epsilon) transition refers to a transition from one state to another without the reading of
an input symbol (i.e. without the tape containing the input string moving). Epsilon transitions can be
inserted between any states. There is also a conversion algorithm from an NFA with epsilon
transitions to an NFA without epsilon transitions.

Consider the NFA-epsilon move machine M = (Q, Σ, δ, q0, F)
Q = {q0, q1, q2}

Σ = {a, b, c} and ε moves

q0 = q0

F = {q2}

[Figure: transition diagram of M]

Note: add an arc from q2 to q2 labeled "c" to the figure above.

The language accepted by the above NFA with epsilon moves is the set of strings over {a,b,c}
including the null string and all strings with any number of a's followed by any number of b's
followed by any number of c's.

Now convert the NFA with epsilon moves to an NFA M' = (Q', Σ, δ', q0', F'). First determine the
states of the new machine, Q' = the epsilon closures of the states in the NFA with epsilon moves.
There will be the same number of states but the names can be constructed by writing the state
name as the set of states in the epsilon closure. The epsilon closure of a state is the state itself and all states
that can be reached from it by one or more epsilon moves.

Thus q0 in the NFA-epsilon becomes {q0,q1,q2} because the machine can move
from q0 to q1 by an epsilon move, then check q1 and find that it can move
from q1 to q2 by an epsilon move.

q1 in the NFA-epsilon becomes {q1,q2} because the machine can move from
q1 to q2 by an epsilon move.

q2 in the NFA-epsilon becomes {q2} just to keep the notation the same. q2
can go nowhere except q2, that is what phi means, on an epsilon move.

We do not show the epsilon transition of a state to itself here, but,
beware, we will take into account the state to itself epsilon transition
when converting NFA's to regular expressions.

The initial state of our new machine is {q0,q1,q2}, the epsilon closure of q0.

The final state(s) of our new machine are the new state(s) that contain
a state symbol that was a final state in the original machine.

The new machine accepts the same language as the old machine, thus same sigma.
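
The epsilon closures used above can be computed by a simple graph search. In the Python sketch below the transition table is an assumption reconstructed from the description of the accepted language (a's, then b's, then c's) because the original diagram is not available; the empty string `""` is used as the label for an epsilon move.

```python
EPS = ""  # label chosen here for an epsilon move

# Assumed transitions of the machine M (the original figure is missing).
delta = {
    ("q0", "a"): {"q0"},
    ("q0", EPS): {"q1"},
    ("q1", "b"): {"q1"},
    ("q1", EPS): {"q2"},
    ("q2", "c"): {"q2"},
}

def epsilon_closure(state, delta):
    """The state itself plus everything reachable through epsilon moves only."""
    closure = {state}
    stack = [state]
    while stack:
        p = stack.pop()
        for q in delta.get((p, EPS), set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return closure

for q in ("q0", "q1", "q2"):
    print(q, sorted(epsilon_closure(q, delta)))
# q0 ['q0', 'q1', 'q2']
# q1 ['q1', 'q2']
# q2 ['q2']
```

The three closures agree with the new state names {q0,q1,q2}, {q1,q2} and {q2} derived in the text.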
Exercises: 


Conversion of NFA to DFA

Let M2 = <Q2, Σ, q2,0, δ2, A2> be an NFA that recognizes a language L. Then the DFA M = <Q, Σ, q0, δ, A>
that satisfies the following conditions recognizes L:

Q = 2^Q2, that is the set of all subsets of Q2,

q0 = {q2,0},

δ(q, a) = ∪_{p ∈ q} δ2(p, a) for each state q in Q and each symbol a in Σ, and

A = {q ∈ Q | q ∩ A2 ≠ ∅}


To obtain a DFA M = <Q, Σ, q0, δ, A> which accepts the same language as the given NFA M2
= <Q2, Σ, q2,0, δ2, A2> does, you may proceed as follows:

Initially Q = ∅.

First put {q2,0} into Q. {q2,0} is the initial state of the DFA M.

Then for each state q in Q do the following:

add the set ∪_{p ∈ q} δ2(p, a), where δ2 here is that of NFA M2, as a state to Q if it is not already in Q,
for each symbol a in Σ.

For this new state, add δ(q, a) = ∪_{p ∈ q} δ2(p, a) to δ, where the δ2 on the right hand side is that of
NFA M2.






When no more new states can be added to Q, the process terminates. All the states of Q that 
contain accepting states of M 2 are accepting states of M. 

Note: The states that are not reached from the initial state are not included in Q obtained by this
procedure. Thus the set of states Q thus obtained is not necessarily equal to 2^Q2.


Example 1: Let us convert the following NFA to DFA.

[Figure: state diagram of the NFA with states 0, 1, 2 and 3 over the alphabet {a, b}]

Initially Q is empty. Then since the initial state of the DFA is {0}, {0} is added to Q.

Since δ2(0, a) = {1,2}, {1,2} is added to Q and δ({0}, a) = {1,2}.

Since δ2(0, b) = ∅, ∅ is added to Q and δ({0}, b) = ∅.

At this point Q = { {0}, {1,2}, ∅ }.

Then since {1,2} is now in Q, the transitions from {1,2} on symbols a and b are computed.

Since δ2(1, a) = {1,2} and δ2(2, a) = ∅, δ({1,2}, a) = {1,2}. Similarly δ({1,2}, b) = {1,3}.
Thus {1,3} is added to Q.

Similarly δ({1,3}, a) = {1,2} and δ({1,3}, b) = ∅. Thus no new states are added to Q.
Since the transitions from all states of Q have been computed and no more states are added to Q,
the conversion process stops here.

Note that there are no states of Q2 in ∅. Hence there are no states that M2 can go to from ∅. Hence
δ(∅, a) = δ(∅, b) = ∅.

For the accepting states of M, since states 0 and 1 are the accepting states of the NFA, all the states
of Q that contain 0 and/or 1 are accepting states. Hence {0}, {1,2} and {1,3} are the
accepting states of M.
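
The procedure can be sketched in Python as a worklist algorithm over sets of NFA states. The function name `nfa_to_dfa` is chosen here for illustration. The transition table below is reconstructed from the values quoted in Example 1; the entry for state 3 on symbol a is an assumption, since the diagram is not available (the text only fixes it to be a subset of {1,2}, and any such choice yields the same DFA).

```python
def nfa_to_dfa(delta2, start, accepting2, alphabet):
    """Subset construction, generating only states reachable from {start}."""
    start_set = frozenset({start})
    states = {start_set}
    delta = {}
    worklist = [start_set]
    while worklist:
        q = worklist.pop()
        for a in alphabet:
            # Union of the NFA moves from every state p contained in q.
            target = frozenset(r for p in q for r in delta2.get((p, a), ()))
            delta[(q, a)] = target
            if target not in states:
                states.add(target)
                worklist.append(target)
    accepting = {q for q in states if q & accepting2}
    return states, delta, start_set, accepting

delta2 = {
    (0, "a"): {1, 2},
    (1, "a"): {1, 2},
    (2, "b"): {1, 3},
    (3, "a"): {1, 2},  # assumed; not determined uniquely by the text
}
states, delta, start, accepting = nfa_to_dfa(delta2, 0, {0, 1}, "ab")

print(sorted(sorted(q) for q in states))     # [[], [0], [1, 2], [1, 3]]
print(sorted(sorted(q) for q in accepting))  # [[0], [1, 2], [1, 3]]
```

Only four of the sixteen subsets of {0, 1, 2, 3} are generated, which illustrates the note that Q need not equal the full power set.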


The DFA thus obtained is shown below.

[Figure: state diagram of the resulting DFA]

Example 2: Similarly the NFA

[Figure: state diagram of the NFA]

is converted to the following DFA:

[Figure: state diagram of the resulting DFA]



Regular Expression and Grammar 

There are three regular operators used to generate a language (union, concatenation and Kleene
closure), together with the derived positive closure, as mentioned below:

1. Union (∪): L1 ∪ L2 = {s | s ∈ L1 or s ∈ L2}

2. Concatenation (.): L1.L2 = {s.t | s ∈ L1 and t ∈ L2}

3. Kleene closure (*): L* = ∪_{i=0..∞} L^i = L^0 ∪ L^1 ∪ L^2 ∪ ...

4. Positive closure (+): L+ = ∪_{i=1..∞} L^i = L^1 ∪ L^2 ∪ L^3 ∪ ...


Example

If L1 = {11, 00}, L2 = {01, 10} over Σ = {0, 1}
then,

L1 ∪ L2 = {11, 00, 01, 10}

L1.L2 = {1101, 1110, 0001, 0010}

L1* = {ε, 11, 00, 1111, 1100, ...}

L1+ = {11, 00, 1111, 1100, ...}

The basic language

The simple languages are those of the form {a} where a ∈ Σ, the empty language ∅ and the
language {ε}.


REGULAR LANGUAGE => Basic language + Regular
operator
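
As a rough illustration of how regular operators act on languages, the Python sketch below treats a finite language as a set of strings. The function names are chosen here, and because the closures are infinite the helper `star_up_to` only collects the powers L^0 through L^n.

```python
def union(l1, l2):
    return l1 | l2

def concat(l1, l2):
    """Every string of l1 followed by every string of l2."""
    return {s + t for s in l1 for t in l2}

def power(l, i):
    """L^i, where L^0 contains only the empty string."""
    result = {""}
    for _ in range(i):
        result = concat(result, l)
    return result

def star_up_to(l, n):
    """L^0 ∪ L^1 ∪ ... ∪ L^n, a finite part of the Kleene closure."""
    out = set()
    for i in range(n + 1):
        out |= power(l, i)
    return out

L1 = {"11", "00"}
L2 = {"01", "10"}
print(sorted(union(L1, L2)))      # ['00', '01', '10', '11']
print(sorted(concat(L1, L2)))     # ['0001', '0010', '1101', '1110']
print(sorted(star_up_to(L1, 2)))  # ['', '0000', '0011', '00', '11', '1100', '1111']
```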


The set of regular languages over an alphabet Σ is defined recursively as below. Any language
belonging to this set is a regular language over Σ.

Definition of Set of Regular Languages:

Basis Clause: ∅, {Λ} and {a} for any symbol a ∈ Σ are regular languages. (Here Λ denotes the empty string.)

Inductive Clause: If L_r and L_s are regular languages, then L_r ∪ L_s, L_r L_s and L_r* are regular
languages.

Extremal Clause: Nothing is a regular language unless it is obtained from the above two clauses.

For example, let Σ = {a, b}. Then since {a} and {b} are regular languages, {a, b} (= {a} ∪ {b})
and {ab} (= {a}{b}) are regular languages. Also since {a} is regular, {a}* is a regular language
which is the set of strings consisting of a's such as Λ, a, aa, aaa, aaaa etc. Note also that Σ*,
which is the set of strings consisting of a's and b's, is a regular language because {a, b} is regular.


Regular expression 

Regular expressions are used to denote regular languages. They can represent regular languages 
and operations on them succinctly. 

The set of regular expressions over an alphabet Σ is defined recursively as below. Any element of
that set is a regular expression.

Basis Clause: ∅, Λ and a are regular expressions corresponding to languages ∅, {Λ} and {a},
respectively, where a is an element of Σ.

Inductive Clause: If r and s are regular expressions corresponding to languages L_r and L_s, then
(r + s), (rs) and (r*) are regular expressions corresponding to languages L_r ∪ L_s, L_r L_s and L_r*,
respectively.

Extremal Clause: Nothing is a regular expression unless it is obtained from the above two 
clauses. 

Examples of regular expressions and the regular languages corresponding to them

• (a + b)^2 corresponds to the language {aa, ab, ba, bb}, that is the set of strings of length 2
over the alphabet {a, b}.

In general (a + b)^k corresponds to the set of strings of length k over the alphabet {a, b}.
(a + b)* corresponds to the set of all strings over the alphabet {a, b}.

• a*b* corresponds to the set of strings consisting of zero or more a's followed by zero or
more b's.

• a*b+a* corresponds to the set of strings consisting of zero or more a's followed by one or
more b's followed by zero or more a's.

• (ab)+ corresponds to the language {ab, abab, ababab, ... }, that is, the set of strings of
repeated ab's.


Note: A regular expression is not unique for a language. That is, a regular language, in general,
corresponds to more than one regular expression. For example (a + b)* and (a*b*)* correspond to
the set of all strings over the alphabet {a, b}.


Definition of Equality of Regular Expressions 

Regular expressions are equal if and only if they correspond to the same language. 

Thus for example (a + b)* = (a*b*)*, because they both represent the language of all strings over
the alphabet {a, b}.

In general, it is not easy to see by inspection whether or not two regular expressions are 
equal. 
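
For small alphabets, a bounded brute-force comparison can at least expose two expressions that are not equal. The Python sketch below is an illustration only, not a decision procedure: it assumes the expressions have been rewritten in Python `re` syntax (with `|` in place of +), and the helper names `strings_up_to` and `first_disagreement` are hypothetical. Agreement up to the length bound is evidence of equality, never a proof.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

def first_disagreement(pattern1, pattern2, alphabet="ab", max_len=6):
    """Return the first string matched by exactly one pattern, or None."""
    r1, r2 = re.compile(pattern1), re.compile(pattern2)
    for w in strings_up_to(alphabet, max_len):
        if (r1.fullmatch(w) is None) != (r2.fullmatch(w) is None):
            return w
    return None

# (a + b)* against (a*b*)*: no disagreement found up to the bound.
same = first_disagreement(r"(a|b)*", r"(a*b*)*")
# (a + b)* against a*b*: these differ, and a witness is returned.
witness = first_disagreement(r"(a|b)*", r"a*b*")
```

Here `same` is `None`, while `witness` is a string that belongs to only one of the two languages.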


Examples and Exercises related to R.E

Ex. 1: Find the shortest string that is not in the language represented by the regular expression
a*(ab)*b*.

Solution: It can easily be seen that Λ, a and b, the strings of length 1 or less, are all in the
language. Of the strings with length 2, aa, bb and ab are in the language. However, ba is not in it.
Thus the answer is ba.


Ex. 2: For the two regular expressions given below, 

(a) find a string corresponding to r2 but not to r1 and

(b) find a string corresponding to both r1 and r2.

r1 = a* + b*        r2 = ab* + ba* + b*a + (a*b)*

Solution: (a) Any string consisting of only a's or only b's and the empty string are in r1. So we
need to find strings of r2 which contain at least one a and at least one b. For example ab and ba are
such strings.

(b) A string corresponding to r1 consists of only a's or only b's or the empty string. The only strings
corresponding to r2 which consist of only a's or only b's are a, b, Λ and the strings consisting of
only b's (the last two from (a*b)*).


Ex. 3: Let r1 and r2 be arbitrary regular expressions over some alphabet. Find a simple (the
shortest and with the smallest nesting of * and +) regular expression which is equal to each of
the following regular expressions.

(a) (r1 + r2 + r1r2 + r2r1)*

(b) (r1(r1 + r2)*)^+

Solution: One general strategy to approach this type of question is to try to see whether or not they
are equal to simple regular expressions that are familiar to us such as a, a*, a^+, (a + b)*, (a + b)^+
etc.

(a) Since (r1 + r2)* represents all strings consisting of strings of r1 and/or r2, r1r2 + r2r1 in the
given regular expression is redundant, that is, they do not produce any strings that are not
represented by (r1 + r2)*. Thus (r1 + r2 + r1r2 + r2r1)* is reduced to (r1 + r2)*.

(b) (r1(r1 + r2)*)^+ means that all the strings represented by it must consist of one or more strings of
(r1(r1 + r2)*). However, the strings of (r1(r1 + r2)*) start with a string of r1 followed by any number
of strings taken arbitrarily from r1 and/or r2. Thus anything that comes after the first r1 in
(r1(r1 + r2)*)^+ is represented by (r1 + r2)*. Hence (r1(r1 + r2)*) also represents the strings of
(r1(r1 + r2)*)^+, and conversely (r1(r1 + r2)*)^+ represents the strings represented by (r1(r1 + r2)*).
Hence (r1(r1 + r2)*)^+ is reduced to (r1(r1 + r2)*).


Ex. 4: Find a regular expression corresponding to the language L over the alphabet { a ,b } 
defined recursively as follows: 

Basis Clause: Λ ∈ L

Inductive Clause: If x ∈ L, then aabx ∈ L and xbb ∈ L.

Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all Λ ∈ L. Then starting with Λ, strings
of L are generated one by one by prepending aab or appending bb to any of the already generated
strings. Hence a string of L consists of zero or more aab's in front and zero or more bb's following
them. Thus (aab)*(bb)* is a regular expression for L.


Ex. 5: Find a regular expression corresponding to the language L defined recursively as 
follows: 

Basis Clause: Λ ∈ L and a ∈ L.

Inductive Clause: If x ∈ L, then aabx ∈ L and bbx ∈ L.

Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all Λ and a are in L. Then starting with
Λ or a, strings of L are generated one by one by prepending aab or bb to any of the already
generated strings. Hence a string of L has zero or more of aab's and bb's in front possibly followed
by a at the end. Thus (aab + bb)*(a + Λ) is a regular expression for L.


Ex. 6: Find a regular expression corresponding to the language of all strings over the alphabet { 
a, b } that contain exactly two a's. 

Solution: A string in this language must have exactly two a's. Since any string of b's can be placed
in front of the first a, behind the second a and between the two a's, and since an arbitrary string of
b's can be represented by the regular expression b*, b*ab*ab* is a regular expression for this
language.


Ex. 7: Find a regular expression corresponding to the language of all strings over the alphabet { 
a,b} that do not end with ab. 

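Solution: Any non-empty string over { a, b } must end in a or b. Hence if a string does not end
with ab then it is Λ, or it ends with a, or it ends with b and that last b is either the whole string
or is preceded by a symbol b. Since a string can have any string in front of the last a or bb,
(a + b)*(a + bb) represents the strings ending with a or bb. The two remaining strings, Λ and b,
are not covered by it, so (a + b)*(a + bb) + b + Λ is a regular expression for the language.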

Ex. 8: Find a regular expression corresponding to the language of all strings over the alphabet
{ a, b } that contain no more than one occurrence of the string aa.

Solution: If there is one substring aa in a string of the language, then that aa can be followed by
any number of b's. If an a comes after that aa, then that a must be preceded by b because otherwise
there are two occurrences of aa. Hence any string that follows aa is represented by (b + ba)*. On
the other hand if an a precedes the aa, then it must be followed by b. Hence a string preceding the
aa can be represented by (b + ab)*. Hence if a string of the language contains aa then it
corresponds to the regular expression (b + ab)*aa(b + ba)*.

If there is no aa but at least one a exists in a string of the language, then applying the same
argument as for aa to a, (b + ab)*a(b + ba)* is obtained as a regular expression corresponding to
such strings.

If there may not be any a in a string of the language, then applying the same argument as for aa to
Λ, (b + ab)*(b + ba)* is obtained as a regular expression corresponding to such strings.
Altogether (b + ab)*(Λ + a + aa)(b + ba)* is a regular expression for the language.
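
Answers such as those of Ex. 7 and Ex. 8 can be sanity-checked by comparing the expression with a direct statement of the language on all short strings. The Python sketch below is only such a bounded check, under the assumption that the expressions are transcribed into Python `re` syntax; the names `mismatches` and `count_aa` are hypothetical helpers. It finds no counterexample to the Ex. 8 answer up to the bound, and it shows that (a + b)*(a + bb) taken on its own misses two short strings that do not end with ab.

```python
import re
from itertools import product

def mismatches(pattern, predicate, alphabet="ab", max_len=7):
    """List the strings (shortest first) on which pattern and predicate disagree."""
    regex = re.compile(pattern)
    bad = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            w = "".join(symbols)
            if (regex.fullmatch(w) is not None) != predicate(w):
                bad.append(w)
    return bad

def count_aa(w):
    """Count occurrences of 'aa' in w, overlapping ones included."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == "aa")

# Ex. 8: (b + ab)*(Λ + a + aa)(b + ba)* against "at most one occurrence of aa".
ex8 = mismatches(r"(b|ab)*(a|aa)?(b|ba)*", lambda w: count_aa(w) <= 1)

# Ex. 7: (a + b)*(a + bb) against "does not end with ab".
ex7 = mismatches(r"(a|b)*(a|bb)", lambda w: not w.endswith("ab"))
```

`ex8` comes back empty, while `ex7` lists exactly the empty string and the string b.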


Ex. 9: Find a regular expression corresponding to the language of strings of even lengths over 
the alphabet of { a, b }.

Solution: Since any string of even length can be expressed as the concatenation of strings of
length 2 and since the strings of length 2 are aa, ab, ba, bb, a regular expression corresponding to 
the language is ( aa + ab + ba + bb )*. Note that 0 is an even number. Hence the string A is in this 
language. 


Ex. 10: Describe as simply as possible in English the language corresponding to the regular 
expression a*b(a*ba*b)*a*.

Solution: A string in the language can start and end with a or b, it has at least one b, and after the
first b all the b's in the string appear in pairs. Any number of a's can appear any place in the string.
Thus simply put, it is the set of strings over the alphabet { a, b } that contain an odd number of b's.

Ex. 11: Describe as simply as possible in English the language corresponding to the regular 
expression ((a + b)^3)*(Λ + a + b).

Solution: ((a + b)^3) represents the strings of length 3. Hence ((a + b)^3)* represents the strings of
length a multiple of 3. Since ((a + b)^3)*(a + b) represents the strings of length 3n + 1, where n is
a natural number, the given regular expression represents the strings of length 3n and 3n + 1,
where n is a natural number (n = 0, 1, 2, ...).


Ex. 12: Describe as simply as possible in English the language corresponding to the regular 
expression (b + ab)*(a + ab)*. 

Solution: (b + ab)* represents strings which do not contain any substring aa and which, when
non-empty, end in b, and (a + ab)* represents strings which do not contain any substring bb and
which, when non-empty, start with a. Hence altogether it represents any string consisting of a
substring with no aa that is empty or ends in b, followed by a substring with no bb that is empty
or starts with a.

Properties of Regular Expressions (R.E)

1. Commutative:

The union of regular expressions is commutative. Let L and R be two R.E's; then

L + R = R + L

2. Associativity:

The union and concatenation operations of R.E are associative. Let L, R and S be R.E's; then

L + (R + S) = (L + R) + S
L (RS) = (LR) S

3. Identities:

∅ is the identity for union, i.e. ∅ + R = R + ∅ = R
ε is the identity for concatenation, i.e. εR = Rε = R

4. Annihilator:

An annihilator for an operation is a value such that when the operator is applied to that value
and another value, the result of the operation is the annihilator.

∅ is the annihilator for concatenation, i.e.

∅R = R∅ = ∅

5. Idempotent law:

If R is an R.E then R + R = R

6. Law of closure:

If R is an R.E then (R*)* = R*

∅* = closure of ∅ = ε
ε* = closure of ε = ε

Theorem 1

If L, M and N are any languages then prove: L (M ∪ N) = LM ∪ LN

Proof:

We have to show that w ∈ L (M ∪ N) iff w ∈ LM ∪ LN.

If part:

If w ∈ LM ∪ LN then w ∈ LM or w ∈ LN (by union rule).
If w ∈ LM then w = xy with x ∈ L and y ∈ M (by concatenation rule).
If w ∈ LN then w = xy with x ∈ L and y ∈ N (by concatenation rule).

In either case x ∈ L and y ∈ (M ∪ N). Hence this implies
xy ∈ L (M ∪ N)

i.e. w ∈ L (M ∪ N)

Proved.

Only-if part:

If w ∈ L (M ∪ N) then w = xy with
x ∈ L and y ∈ (M ∪ N) (by concatenation rule).

If y ∈ M then xy ∈ LM
If y ∈ N then xy ∈ LN

So, xy ∈ LM ∪ LN
Hence, w ∈ LM ∪ LN

Proved.
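
The identity can also be observed on small finite languages. The Python sketch below is only an illustration of the statement, not a substitute for the proof: languages are modelled as finite sets of strings, and the helper `concat` and the sample sets `L`, `M`, `N` are assumptions chosen for the example.

```python
def concat(X, Y):
    """Concatenation of two finite languages: every x in X followed by every y in Y."""
    return {x + y for x in X for y in Y}

# Sample finite languages (arbitrary choices for the illustration).
L = {"", "a", "ab"}
M = {"b", "ba"}
N = {"", "a"}

lhs = concat(L, M | N)              # L (M ∪ N)
rhs = concat(L, M) | concat(L, N)   # LM ∪ LN
```

Both `lhs` and `rhs` are the same set of strings, as Theorem 1 predicts.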


Theorem 2

For any R.E r, there is an ε-NFA that accepts the same language represented by r.

Proof:

[Figure: ε-NFA constructions, one per case of r, including case 2: r = r1.r2]

Conversion from DFA to R.E 
Arden’s Theorem 

Let p and q be two regular expressions over the alphabet Σ. If p doesn't contain the empty string, then
r = q + rp has a unique solution, i.e.

r = qp*

Proof:

r = q + rp
r = q + (q + rp)p = q + qp + rp^2
Substituting r = q + rp again and again,
r = q + qp + qp^2 + qp^3 + ...
= q(ε + p + p^2 + p^3 + ...)
= qp*

Proved.

Use of Arden’s rule: 

To convert a DFA into an R.E, there are certain assumptions regarding the transition system.
They are as follows:

i) The transition diagram should not have ε-transitions.

ii) It must have only a single starting state.

iii) Its vertices are q1, q2, q3, ..., qn.

iv) qi is a final state.

v) wij denotes the regular expression representing the set of labels of edges from qi to qj.

We can get the following equations (q1 is the starting state, so only its equation contains ε):

q1 = q1w11 + q2w21 + q3w31 + ... + qnwn1 + ε

q2 = q1w12 + q2w22 + q3w32 + ... + qnwn2

...

qn = q1w1n + q2w2n + q3w3n + ... + qnwnn

Hence, solving these equations for qi in terms of wij's gives the R.E.
Example: 

Convert the following DFA to R.E 



q1 = q2 1 + q3 0 + ε ........ (i)

q2 = q1 0 ........ (ii)

q3 = q1 1 ........ (iii)

q4 = q2 0 + q3 1 + q4 0 + q4 1 ........ (iv)

Now put q2 and q3 in eqn (i):
q1 = q1 01 + q1 10 + ε
= ε + q1 (01 + 10)
This has the form r = q + rp, where
q = ε
r = q1
p = 01 + 10

Therefore, q1 = ε (01 + 10)*
Since q1 is the final state,
R.E = ε (01 + 10)*
= (01 + 10)* is the required R.E for the given diagram.
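
The result can be cross-checked by simulating the automaton. In the Python sketch below, the transition table `DELTA` is an assumption read off equations (i) to (iv) (each term qj a on the right of the equation for qk is taken as an edge from qj to qk labelled a), with q1 as both start and final state; `dfa_accepts` and `first_mismatch` are hypothetical helper names. The check is bounded, so it supports the derivation rather than proving it.

```python
import re
from itertools import product

# Edges reconstructed from the equations: (source state, symbol) -> target state.
DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q3",
    ("q2", "1"): "q1", ("q2", "0"): "q4",
    ("q3", "0"): "q1", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}

def dfa_accepts(w, start="q1", finals=frozenset({"q1"})):
    """Run the DFA on w and report whether it stops in a final state."""
    state = start
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in finals

def first_mismatch(max_len=10):
    """Return the first string on which the DFA and (01 + 10)* disagree, or None."""
    regex = re.compile(r"(01|10)*")
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            if dfa_accepts(w) != (regex.fullmatch(w) is not None):
                return w
    return None

result = first_mismatch()
```

`result` is `None`: no string of length up to 10 separates the automaton from (01 + 10)*.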

Pumping lemma for regular languages

The pumping lemma for regular languages describes an essential property of all regular languages. 
Informally, it says that all sufficiently long words in a regular language may be pumped, that is,
have a middle section of the word repeated an arbitrary number of times to produce a new word 
which also lies within the same language. 

Specifically, the pumping lemma says that for any regular language L there exists a constant p such 
that any word w in L with length at least p can be split into three substrings, w = xyz, where the 
middle portion y must not be empty, such that the words xz, xyz, xyyz, xyyyz, .. . constructed by 
repeating y an arbitrary number of times (including zero times) are still in L. This process of 
repetition is known as "pumping". Moreover, the pumping lemma guarantees that the length of xy 
will be at most p, imposing a limit on the ways in which w may be split. Finite languages trivially 
satisfy the pumping lemma by having p equal to the maximum string length in L plus one. 

Here's what the pumping lemma says: 

• If an infinite language is regular, it can be defined by a DFA. 

• The DFA has some finite number of states (say, n). 

• Since the language is infinite, some strings of the language must have length > n. 

• For a string of length > n accepted by the DFA, the walk through the DFA must contain a 
cycle. 

• Repeating the cycle an arbitrary number of times must yield another string accepted by the
DFA.

The pumping lemma for regular languages is another way of proving that a given (infinite)
language is not regular. (The pumping lemma cannot be used to prove that a given language is 
regular.) 

The Pumping Lemma is generally used to prove a language is not regular. 

If a DFA, NFA or NFA-epsilon machine can be constructed to exactly accept a language, then the 
language is a Regular Language. If a regular expression can be constructed to exactly generate the 
strings in a language, then the language is regular. 

If a regular grammar can be constructed to exactly generate the strings in a language, then the 
language is regular. 

To prove a language is not regular requires a specific definition of the language and the use of the 
Pumping Lemma for Regular Languages. 

A note about proofs using the Pumping Lemma: 

Given: Formal statements A and B. 

A implies B. 

If you can prove B is false, then you have proved A is false. 

For the Pumping Lemma, the statement "A" is "L is a Regular Language", 

The statement "B" is a statement from the Predicate Calculus. 

(This is a plain text file that uses words for the upside down A that reads 'for all' and the 
backwards E that reads 'there exists') 

Applying the Pumping Lemma 

Here's a more formal definition of the pumping lemma: 

If L is an infinite regular language, then there exists some positive integer m such that any string
w ∈ L whose length is m or greater can be decomposed into three parts, xyz, where

• |xy| is less than or equal to m,

• |y| > 0,

• w_i = xy^i z is also in L for all i = 0, 1, 2, 3, ....

Here's what it all means: 

• m is a (finite) number chosen so that strings of length m or greater must contain a cycle. 
Hence, m must be equal to or greater than the number of states in the dfa. Remember that 
we don't know the dfa, so we can't actually choose m; we just know that such an m must
exist.

• Since string w has length greater than or equal to m, we can break it into two parts, xy and
z, such that xy must contain a cycle. We don't know the dfa, so we don't know exactly
where to make this break, but we know that |xy| can be less than or equal to m.

• We let x be the part before the cycle, y be the cycle, and z the part after the cycle. (It is 
possible that x and z contain cycles, but we don't care about that.) Again, we don't know 
exactly where to make this break. 

• Since y is the cycle we are interested in, we must have |y| > 0, otherwise it isn't a cycle. 

• By repeating y an arbitrary number of times, xy*z, we must get other strings in L. 

• If, despite all the above uncertainties, we can show that the dfa has to accept some string 
that we know is not in the language, then we can conclude that the language is not regular. 
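
When a DFA is actually known, the split into x, y and z can be computed by following the walk until a state repeats. The Python sketch below illustrates this under stated assumptions: the example DFA (strings over {a, b} whose number of a's is a multiple of 3) and the helper names `pump_split` and `accepts` are not from the text and were chosen only for the demonstration.

```python
def pump_split(delta, start, w):
    """Split w into (x, y, z), where y labels the first cycle in the DFA walk."""
    first_visit = {start: 0}            # state -> number of symbols read on arrival
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in first_visit:        # the walk has closed a cycle
            i = first_visit[state]
            return w[:i], w[i:pos], w[pos:]
        first_visit[state] = pos
    return None                         # w is too short to force a repeated state

def accepts(delta, start, finals, w):
    """Run the DFA on w."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

# Example DFA with 3 states: state k means "number of a's read so far is k mod 3".
DELTA = {}
for k in range(3):
    DELTA[(k, "a")] = (k + 1) % 3
    DELTA[(k, "b")] = k

x, y, z = pump_split(DELTA, 0, "abaab")
pumped_ok = all(accepts(DELTA, 0, {0}, x + y * i + z) for i in range(6))
```

For this walk the cycle is found within the first 3 symbols, so |xy| is at most the number of states, y is non-empty, and every pumped string x y^i z checked here is still accepted.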


Formal statement of the Pumping Lemma: 

L is a Regular Language implies
(there exists n)(for all z)[z in L and |z|>=n implies
{(there exists u,v,w)(z = uvw and |uv|<=n and |v|>=1 and
(for all i>=0)(uv^i w is in L) )}]

The two commonest ways to use the Pumping Lemma to prove a language 
is NOT regular are: 

a) show that there is no possible n for the (there exists n),
this is usually accomplished by showing a contradiction such
as (n+1)(n+1) < n*n+n

b) show there is no way to partition z into u, v and w such that
uv^i w is in L, typically for a value i=0 or i=2.

Be sure to cover all cases by argument or enumerating cases. 

Examples and Exercises related to pumping lemma for regular language. 

1. Prove that L = {0^i | i is a perfect square} is not a regular language.

Proof: Assume that L is regular and let m be the integer guaranteed by the pumping lemma. Now,
consider the string w = 0^(m^2). Clearly w ∈ L, so w can be written as w = xyz with |xy| ≤ m and
y ≠ Λ (or |y| > 0). Consider what happens when i = 2. That is, look at xy^2z. Then, we have
m^2 = |w| < |xy^2z| ≤ m^2 + m = m(m + 1) < (m + 1)^2. That is, the length of the string xy^2z lies
strictly between two consecutive perfect squares. This means xy^2z ∉ L, contradicting the
assumption that L is regular.

2. Prove that L = {ww | w ∈ {a, b}*} is not regular.

Assume L is regular and let m be the integer from the pumping lemma. Choose w = a^m b a^m b.
Clearly, w ∈ L so by the pumping lemma, w = xyz such that |xy| ≤ m, |y| > 0 and xy^i z ∈ L for all
i ≥ 0. Let p = |y|. Consider what happens when i = 0. The resulting string is xz = a^(m-p) b a^m b.
Since p ≥ 1, the number of a's in the two runs are not the same, and thus this string is not in L.
Therefore L is not regular.

3. Prove that L = {a^n b^n : n ≥ 0} is not regular.

1. We don't know m, but assume there is one.

2. Choose a string w = a^n b^n where n > m, so that any prefix of length m consists entirely of
a's.

3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely
of a's. Moreover, y cannot be empty.

4. Choose i = 0. This has the effect of dropping |y| a's out of the string, without affecting the
number of b's. The resultant string has fewer a's than b's, hence does not belong to L.
Therefore L is not regular.
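
The case analysis in steps 3 and 4 can be replayed exhaustively for any particular value of m. The Python sketch below is a finite illustration of the argument, not a proof for all m: it takes w = a^(m+1) b^(m+1) as one admissible choice with n > m, and the helper names `in_L`, `decompositions` and `every_split_fails` are hypothetical.

```python
def in_L(s):
    """Membership test for L = {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def decompositions(w, m):
    """Yield every split w = xyz with |xy| <= m and |y| > 0."""
    for x_len in range(m + 1):
        for xy_len in range(x_len + 1, m + 1):
            yield w[:x_len], w[x_len:xy_len], w[xy_len:]

def every_split_fails(m, i=0):
    """True if pumping with exponent i leaves L for every admissible split."""
    w = "a" * (m + 1) + "b" * (m + 1)
    return all(not in_L(x + y * i + z) for x, y, z in decompositions(w, m))

drop_fails = all(every_split_fails(m, i=0) for m in range(1, 8))
double_fails = all(every_split_fails(m, i=2) for m in range(1, 8))
```

For every m tried, no admissible split survives pumping with i = 0 (or with i = 2), which is exactly what the proof claims for arbitrary m.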


4. Prove that L = {a^n b^k : n > k and n ≥ 0} is not regular.

1. We don't know m, but assume there is one.

2. Choose a string w = a^n b^k where n > m, so that any prefix of length m consists entirely of
a's, and k = n-1, so that there is just one more a than b.

3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, xy must consist entirely
of a's. Moreover, y cannot be empty.

4. Choose i = 0. This has the effect of dropping |y| a's out of the string, without affecting the
number of b's. The resultant string has fewer a's than before, so it has either fewer a's than
b's, or the same number of each. Either way, the string does not belong to L, so L is not
regular.

5. Prove that L = {a^n : n is a prime number} is not regular.

1. We don't know m, but assume there is one.

2. Choose a string w = a^n where n is a prime number and |xyz| = n > m+1. (This can always be
done because there is no largest prime number.) Any prefix of w consists entirely of a's.

3. We don't know the decomposition of w into xyz, but since |xy| ≤ m, it follows that |z| > 1.
As usual, |y| > 0.

4. Since |z| > 1, |xz| > 1. Choose i = |xz|. Then |xy^i z| = |xz| + |y||xz| = (1 + |y|)|xz|. Since
(1 + |y|) and |xz| are each greater than 1, the product must be a composite number. Thus |xy^i z| is
a composite number, so xy^i z does not belong to L and L is not regular.
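
The arithmetic in step 4 can be checked numerically for sample values. The Python sketch below is only an illustration for particular n and m (here hypothetical choices such as n = 11, m = 5); since w consists of a's only, a split is determined by the lengths of its parts, and the helper names `is_prime` and `pumped_lengths` are assumptions.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def pumped_lengths(n, m):
    """For w = a^n, yield |x y^i z| with i = |xz|, for every |y| allowed by |xy| <= m."""
    for y_len in range(1, m + 1):
        xz_len = n - y_len
        yield xz_len + y_len * xz_len    # equals (1 + |y|) * |xz|

all_composite = all(not is_prime(k) for k in pumped_lengths(11, 5))
```

With n = 11 and m = 5 (so n is prime and n > m + 1), every pumped length is composite, matching the factorisation (1 + |y|)|xz|.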


Context-free grammar 


Definition 

A Context Free Grammar is defined by a four-tuple, G = (T, N, S, R) where,
• T is the set of terminals (lexicon)
• N is the set of non-terminals. For NLP, we usually distinguish out a set P ⊂ N of pre-terminals which always
rewrite as terminals.

• S is the start symbol (one of the non-terminals)

• R is the set of rules/productions of the form X -> γ, where X is a non-terminal and γ is a sequence of terminals and
non-terminals (may be empty).

• A grammar G generates a language L. 

An example context-free grammar 

G = (T, N, S, R)

T = {that, this, a, the, man, book, flight, meal, include, read, does}

N = {S, NP, NOM, VP, Det, Noun, Verb, Aux}

S = S

R = {
S -> NP VP
S -> Aux NP VP
S -> VP
NP -> Det NOM
NOM -> Noun
NOM -> Noun NOM
VP -> Verb
VP -> Verb NP
Det -> that | this | a | the
Noun -> book | flight | meal | man
Verb -> book | include | read
Aux -> does
}

Application of grammar rules 

S => NP VP
=> Det NOM VP
=> The NOM VP
=> The Noun VP
=> The man VP
=> The man Verb NP
=> The man read NP
=> The man read Det NOM
=> The man read this NOM
=> The man read this Noun
=> The man read this book
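
The derivation above can be replayed mechanically. The Python sketch below is a minimal illustration, assuming the rule set of the example grammar encoded as a dictionary; the names `RULES` and `leftmost_derivation` are hypothetical, and terminals are written in lower case as in the set T. Each chosen right-hand side is applied to the leftmost non-terminal of the current sentential form.

```python
# Rules of the example grammar: non-terminal -> list of right-hand sides.
RULES = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "NOM"]],
    "NOM": [["Noun"], ["Noun", "NOM"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Det": [["that"], ["this"], ["a"], ["the"]],
    "Noun": [["book"], ["flight"], ["meal"], ["man"]],
    "Verb": [["book"], ["include"], ["read"]],
    "Aux": [["does"]],
}

def leftmost_derivation(choices, start="S"):
    """Apply each chosen right-hand side to the leftmost non-terminal, in order."""
    form = [start]
    steps = [list(form)]
    for rhs in choices:
        idx = next(i for i, sym in enumerate(form) if sym in RULES)
        if rhs not in RULES[form[idx]]:
            raise ValueError(f"{form[idx]} -> {' '.join(rhs)} is not a rule")
        form[idx:idx + 1] = rhs          # rewrite the leftmost non-terminal
        steps.append(list(form))
    return steps

steps = leftmost_derivation([
    ["NP", "VP"], ["Det", "NOM"], ["the"], ["Noun"], ["man"],
    ["Verb", "NP"], ["read"], ["Det", "NOM"], ["this"], ["Noun"], ["book"],
])
sentence = " ".join(steps[-1])
```

The eleven rule applications reproduce the eleven derivation steps shown above, ending in the sentence "the man read this book".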


Parse tree

S
    NP
        Det
            The
        NOM
            Noun
                man
    VP
        Verb
            read
        NP
            Det
                this
            NOM
                Noun
                    book


Why are such grammars called 'context free'? Because all rules contain only one symbol on the left hand side, and
wherever we see that symbol while doing a derivation, we are free to replace it with the stuff on the right hand
side. That is, the 'context' in which a symbol on the left hand side of a rule occurs is unimportant: we can always use
the rule to make the rewrite while doing a derivation.

A language is called context free if it is generated by some context free grammar. For example, the language is 
context free. Not all languages are context free. For example, is not. 

BNF (Backus Normal Form)

BNF (Backus Normal Form or Backus-Naur Form) is one of the two main notation techniques for context-free 
grammars, often used to describe the syntax of languages used in computing, such as computer programming 
languages, document formats, instruction sets and communication protocols. 

Backus-Naur Form is the name of many closely related Meta Languages for describing the syntax of a Programming 
Language. 

A BNF specification is a set of derivation rules, written as 
<symbol> ::= _expression_ 

where <symbol> is a nonterminal , and the expression consists of one or more sequences of symbols; more 
sequences are separated by the vertical bar, ’|’, indicating a choice, the whole being a possible substitution for the 
symbol on the left. Symbols that never appear on a left side are terminals. On the other hand, symbols that appear on a 
left side are non-terminals and are always enclosed between the pair <>. 

The meta-symbols of BNF are:

::=     meaning "is defined as"

|       meaning "or"

< >     angle brackets used to surround category names.

The angle brackets distinguish syntax rules names (also called non-terminal symbols) from terminal symbols which 
are written exactly as they are to be represented. A BNF rule defining a nonterminal has the form: 
nonterminal ::= sequence_of_alternatives consisting of strings of 
terminals or nonterminals separated by the meta-symbol | 

For example, the BNF production for a mini-language is:

<program> ::= program
                  <declaration_sequence>
              begin
                  <statements_sequence>
              end ;

This shows that a mini-language program consists of the keyword "program" followed by the declaration sequence,
then the keyword "begin" and the statements sequence, finally the keyword "end" and a semicolon.

Examples of CFG 

Example 1 

We have shown that L = {a^n b^n : n ≥ 0} is not regular. Here is a context-free grammar for this language.

G = ({S}, {a, b}, S, {S -> aSb, S -> λ})

Example 2

We have shown that L = {a^n b^k : k > n ≥ 0} is not regular. Here is a context-free grammar for this language.

G = ({S, B}, {a, b}, S, {S -> aSb, S -> B, B -> bB, B -> b}).

Example 3

The language L = {ww^R : w ∈ {a, b}*}, where each string in L is a palindrome, is not regular. Here is a context-free
grammar for this language.

G = ({S}, {a, b}, S, {S -> aSa, S -> bSb, S -> λ}).

Example 4

The language L = {w : w ∈ {a, b}*, n_a(w) = n_b(w)}, where each string in L has an equal number of a's and b's, is not
regular. Consider the following grammar:

G = ({S}, {a, b}, S, {S -> aSb, S -> bSa, S -> SS, S -> λ}).

Example 5

The language L, consisting of balanced strings of parentheses, is context-free but not regular. The grammar is simple,
but we have to be careful to keep our symbols ( and ) separate from our meta-symbols ( and ).

G = ({S}, {(, )}, S, {S -> (S), S -> SS, S -> λ})
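
Small grammars like these can be explored by enumerating the short strings they derive. The Python sketch below is an illustration under stated assumptions: variables are single characters, "" stands for λ, the names `generate`, `EX1` and `EX4` are hypothetical, and the pruning bound `limit` on sentential-form length is a heuristic that is sufficient for these two grammars but is not a general guarantee.

```python
from collections import deque

def generate(rules, start, max_len):
    """Terminal strings of length <= max_len reachable by leftmost derivations."""
    limit = 2 * max_len + 2             # heuristic cap on sentential-form length
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:                 # no variable left: a string of the language
            words.add(form)
            continue
        for rhs in rules[form[idx]]:    # expand the leftmost variable every way
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for c in new if c not in rules)
            if terminals <= max_len and len(new) <= limit and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

EX1 = {"S": ["aSb", ""]}                # Example 1: a^n b^n
EX4 = {"S": ["aSb", "bSa", "SS", ""]}   # Example 4: equal numbers of a's and b's

words1 = generate(EX1, "S", 6)
words4 = generate(EX4, "S", 4)
```

`words1` contains the strings a^n b^n for n from 0 to 3, and `words4` contains every string of length at most 4 with as many a's as b's.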

Sentential Forms 

A sentential form is the start symbol S of a grammar or any string in (V ∪ T)* that can be derived from S.

Consider the linear grammar

({S, B}, {a, b}, S, {S -> aS, S -> B, B -> bB, B -> λ}).

A derivation using this grammar might look like this:

S => aS => aB => abB => abbB => abb

Each of {S, aS, aB, abB, abbB, abb} is a sentential form.

Because this grammar is linear, each sentential form has at most one variable. Hence there is never any choice about
which variable to expand next.

Pumping Lemma for context free Language 

Size of Parse Tree: Any grammar in CNF produces, for any string, a parse tree that is a binary tree.

Theorem: Let w be the yield of a parse tree generated by a grammar G = (V, T, P, S) in CNF. If the length of the
longest path is n, then |w| ≤ 2^(n-1).

Proof: By Induction

Basis: If n = 1 then the tree consists of only a root and a leaf labeled with a terminal.

So, string w is a single terminal.

|w| = 1 = 2^(1-1) = 2^0 = 1, which is true.

Inductive: Suppose n is the length of the longest path and n > 1. The root of the tree uses a production A -> BC, since n > 1.
No path in the subtrees rooted at B and C can have length greater than n-1, since B and C are children of A.

[Figure: a parse tree with longest path n whose two subtrees have longest path at most n-1]

By the inductive hypothesis, the yields of these subtrees are of length at most 2^(n-2) (since the longest path in each is at most n-1).
So the yield of the entire tree is the concatenation of these two yields, i.e.

|w| ≤ 2^(n-2) + 2^(n-2)

|w| ≤ 2 · 2^(n-2)

|w| ≤ 2^(1+n-2)

|w| ≤ 2^(n-1). Hence proved.


Statement of Pumping Lemma: Let L be a CFL. There exists a constant n such that if z is any string in L such that
|z| ≥ n, there are strings u, v, w, x, y satisfying

z = uvwxy

|vx| > 0

|vwx| ≤ n

For all i ≥ 0, uv^i w x^i y ∈ L

Proof: As a first step, we can find a CNF grammar G that generates L - {ε}.

- Let p be the number of variables in the grammar and let n = 2^p.

Since the derivation tree for z is a binary tree (G being a CNF grammar), it must have height at least p+1, since any parse
tree whose longest path is p must have a yield of length at most 2^(p-1).

- Let us consider a path of maximum length and look at its bottom portion, consisting of a leaf node and the p+1 nodes
above it. Each of these p+1 nodes corresponds to a variable, and since there are only p distinct variables, some
variable A must appear twice in this portion.

- Let w be the portion of z derived from the A closest to the leaf and t = vwx be the portion of z derived from the other A.
If u and y represent the beginning and ending portions of z, we have z = uvwxy.

- The A closest to the root in this portion of the path is the root of a binary derivation tree for vwx. Since we began
with a path of maximum length, this tree has height ≤ p+1 and so |vwx| ≤ 2^p, i.e. |vwx| ≤ n.

- The node containing this A has two children, both corresponding to variables. Let B denote the one that is not an
ancestor of the other A. The string of terminals derived from B doesn't overlap w. It follows that either v or x is not
null. So, |vx| > 0.

- Finally, S =>* uAy =>* uvAxy =>* uvwxy

The first A is the one in the higher node and the second is the one in the lower node. Since the derivation A =>* vAx
can be applied any number of times (including zero) before A =>* w, this concludes the theorem, i.e.
uv^i w x^i y ∈ L for all i ≥ 0.

What is derivation tree? 

Given a grammar with the usual representation G = (V, T, P, S) with variables V, terminal symbols T, set of 
productions P and the start symbol from V called S. 

A derivation tree is constructed with 

1) Each tree vertex is a variable or terminal or epsilon 

2) The root vertex is S 

3) Interior vertices are from V, leaf vertices are from T or epsilon

4) An interior vertex A has children, in order, left to right, X1, X2, ... , Xk when there is a production in P of the
form A -> X1 X2 ... Xk

5) A leaf can be epsilon only when there is a production A -> epsilon and the leaf's parent can have only this child.

• A grammar may have an unbounded number of derivation trees. It just depends on which production is 
expanded at each vertex. 

• For any valid derivation tree, reading leafs from left to right gives one string in the language defined by the 
grammar. There may be many derivation trees for a single string in the language. 

• If the grammar is a CFG then a leftmost derivation tree exists for every string in the corresponding CFL. 
There may be more than one leftmost derivation trees for some string. If the grammar is a CFG then a 
rightmost derivation tree exists for every string in the corresponding CFL. There may be more than one 
rightmost derivation tree for some string. 

Ambiguous Grammar 

The grammar is called "ambiguous" if, for some string in the language defined by the grammar, the leftmost (rightmost)
derivation tree is not unique. The leftmost and rightmost derivations are usually distinct but might be the same.
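
Ambiguity of a small grammar can be observed by counting parse trees for a given string. The Python sketch below does this for the expression grammar A -> A + A | A - A | a, which this text uses as its ambiguity example; the function name `count_parses` is hypothetical and the counting scheme is hard-wired to this one grammar rather than being a general parser. A count above 1 for some string shows that the grammar is ambiguous.

```python
from functools import lru_cache

def count_parses(tokens):
    """Number of parse trees for `tokens` under A -> A + A | A - A | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Parse trees deriving tokens[i:j] from A."""
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try every operator position k as the root production A -> A op A.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

trees_plus = count_parses("a+a+a")
trees_mixed = count_parses("a+a-a")
```

Both strings have two parse trees (and hence two leftmost derivations), whereas a + a has only one.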

Leftmost derivation Tree 

Given a grammar and a string in the language represented by the grammar, a leftmost derivation tree is constructed 
bottom up by finding a production in the grammar that has the leftmost character of the string (possibly more than one 
may have to be tried) and building the tree towards the root. Then work on the second character of the string. After 
much trial and error, you should get a derivation tree with a root S. 

Examples: Construct a grammar for L = { x 0^n y 1^n z | n ≥ 0 }

Recognize that 0^n y 1^n is a base language, say B
B -> y | 0B1 (The base y, the recursion 0B1)

Then, the language is completed by S -> xBz using the prefix, base language and suffix.

(Note that x, y and z could be any strings not involving n)

G = ( V, T, P, S ) where
V = { B, S } T = { x, y, z, 0, 1 } S = S
P = S -> xBz
    B -> y | 0B1

Now construct an arbitrary derivation for S =>*_G x00y11z

A derivation always starts with the start variable, S. The "=>", "*" and "G" stand for "derivation", "any number of
steps", and "over the grammar G" respectively.

The intermediate terms, called sentential forms, may contain variable and terminal symbols.

Any variable, say B, can be replaced by the right side of any production of the form B -> <right side>

A leftmost derivation always replaces the leftmost variable in the sentential form.

One possible derivation using the grammar above is
S => xBz => x0B1z => x00B11z => x00y11z

The derivation must obviously stop when the sentential form has only terminal symbols. (No more substitutions 
possible.) The final string is in the language of the grammar. But, this is a very poor way to generate all strings in the 
grammar! 

A "derivation tree", sometimes called a "parse tree", uses the rules above: start with the start symbol, expand the tree by creating branches using any right side of a start symbol rule, and so on.

        S
      / | \
     x  B  z
      / | \
     0  B  1
      / | \
     0  B  1
        |
        y

The derivation ends with x 0 0 y 1 1 z: all leaves are terminal symbols, so this is a string in the language generated by the grammar.
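The leftmost derivation above can be replayed mechanically. The following is a minimal sketch, assuming each symbol is a single character and that the caller picks which production to apply at each step; the names PRODUCTIONS, leftmost_step and derive are illustrative, not part of the notes.

```python
# Productions of the example grammar: S -> xBz, B -> y | 0B1.
PRODUCTIONS = {"S": ["xBz"], "B": ["y", "0B1"]}


def leftmost_step(form, choice):
    """Replace the leftmost variable in `form` by its production number `choice`."""
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            return form[:i] + PRODUCTIONS[symbol][choice] + form[i + 1:]
    raise ValueError("no variable left to expand")


def derive(choices, start="S"):
    """Return every sentential form produced by applying `choices` in order."""
    forms = [start]
    for choice in choices:
        forms.append(leftmost_step(forms[-1], choice))
    return forms


# S -> xBz, then B -> 0B1 twice, then B -> y.
steps = derive([0, 1, 1, 0])
print(" => ".join(steps))
```

The list of choices plays the role of the "trial and error" in the text: a different list gives a different string of the language.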

Example 2:

Given G = (V, T, P, S), V = {S, E, I}, T = {a, b, c}, S = S

P =

I -> a | b | c
E -> I | E+E | E*E
S -> E (a subset of the grammar from the book)

Given a string a + b * c

            S
            |
            E
          / | \
         E  *  E
       / | \   |
      E  +  E  I
      |     |  |
      I     I  c
      |     |
      a     b



Example 3: Leftmost and rightmost derivation 

Now consider the grammar 

G = ({S, A, B, C}, {a, b, c}, S, P) where P = {S -> ABC, A -> aA, A -> λ, B -> bB, B -> λ, C -> cC, C -> λ}.

With this grammar, there is a choice of variables to expand. Here is a sample derivation: 

S => ABC => aABC => aABcC => aBcC => abBcC=> abBc=> abbBc => abbc 
If we always expanded the leftmost variable first, we would have a leftmost derivation: 

S => ABC=> aABC=> aBC => abBC=> abbBC => abbC => abbcC=> abbc 

Conversely, if we always expanded the rightmost variable first, we would have a rightmost derivation: 

S => ABC => ABcC => ABc => AbBc => AbbBc => Abbc => aAbbc=> abbc 

There are two things to notice here: 

• Different derivations result in quite different sentential forms, but

• For a context-free grammar, it really doesn't make much difference in what order we expand the variables.

Example: 

The context free grammar 

A -> A + A | A - A | a

is ambiguous since there are two leftmost derivations for the string a + a + a:

First derivation:

A => A + A
  => A + A + A   (the first A is replaced by A + A; replacing the second A would yield a similar derivation)
  => a + A + A
  => a + a + A
  => a + a + a

Second derivation:

A => A + A
  => a + A
  => a + A + A
  => a + a + A
  => a + a + a

As another example, the grammar is ambiguous since there are two parse trees for the string a + a - a: one groups it as (a + a) - a, the other as a + (a - a).

The language that it generates, however, is not inherently ambiguous; the following is a non-ambiguous grammar generating the same language:

A -> A + a | A - a | a
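One way to see the difference between the two grammars is to count parse trees. The sketch below is an illustration only: it assumes the input is a string of the single-character tokens a, + and -, and the function names are made up for this example.

```python
from functools import lru_cache


def count_trees_ambiguous(tokens):
    """Count parse trees of `tokens` under A -> A + A | A - A | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from A.
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        for k in range(i + 1, j - 1):  # k is the position of the top-level operator
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


def count_trees_left_recursive(tokens):
    """Count parse trees of `tokens` under A -> A + a | A - a | a."""
    tokens = tuple(tokens)

    def count(j):
        # Number of parse trees deriving the prefix tokens[:j] from A.
        if j == 1:
            return 1 if tokens[0] == "a" else 0
        if j >= 3 and tokens[j - 2] in ("+", "-") and tokens[j - 1] == "a":
            return count(j - 2)
        return 0

    return count(len(tokens)) if tokens else 0


print(count_trees_ambiguous("a+a+a"), count_trees_left_recursive("a+a+a"))
```

A count above 1 means the string has more than one parse tree under that grammar; the left-recursive grammar never gives more than one.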


Chomsky Normal Form 

It is convenient to assume that every context-free grammar can, without loss of generality, be put into a special format called a normal form. One such format is Chomsky Normal Form (CNF). In CNF, we expect every production to be of the form A -> BC or D -> d, where A, B, C and D are non-terminal symbols and d is a terminal symbol (not lambda). If lambda (the empty string, written λ) is actually part of the language, then S -> λ is allowed.

We will use CNF in three different places: 

• A proof of a pumping lemma for CFG’s. 

• A proof that every language generated by a CFG can be accepted by a non-deterministic pushdown machine. 

• An algorithm (dynamic programming style) for determining whether a given string is generated by a given 
context free grammar. 

There are many steps needed to turn an arbitrary CFG into CNF. The steps are listed below: 

• Get rid of Useless symbols. 

• Get rid of lambda-productions. 

• Get rid of Unit Productions. 

• Get rid of Long Productions. 

• Get rid of Terminal symbols. 

The steps are completely algorithmic with step one repeatable after each of the other steps if necessary. We will do a 
complete example in class. 

Useless Symbols - 

Delete all productions containing non-terminal symbols that cannot generate terminal strings. 

Delete all productions containing non-terminal symbols that cannot be reached by S. 

The details of the two steps for finding useless symbols are very similar and each is a bottom-up style algorithm. To 
find all non-terminal symbols that generate terminal strings, we do it inductively starting with all non-terminal 
symbols that generate a single terminal string in one step. Call this set T. Then we iterate again looking for 
productions whose right sides are combinations of terminal symbols and non-terminal from T. The non-terminals on 
the left sides of these productions are added to T, and we repeat. This continues until T remains the same through an 
iteration. 


To find all non-terminal symbols that can be reached by S, we do a similar thing but we start from S and check which non-terminals appear on the right side of its productions. Call this set T. Then we check which non-terminals appear on the right side of productions whose left side is a non-terminal in T. This continues until T remains the same through an iteration.

The steps need to be done in this order. For example, if you do it in the opposite order, then the grammar S -> AB, S -> 0, A -> 0 would result in the grammar S -> 0, A -> 0. If we do it in the correct order, then we get the right answer, namely just S -> 0.

Subsequent steps may introduce new useless symbols. Hence Useless symbol removal can be done after each of the 
upcoming steps to ensure that we don’t waste time carrying Useless symbols forward. 
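The two passes can be sketched as follows. This is a minimal illustration, assuming productions are given as (head, body) pairs with the body a tuple of symbols and that the set of non-terminals is passed in explicitly; the function name remove_useless is hypothetical.

```python
def remove_useless(productions, nonterminals, start):
    """Drop non-generating symbols first, then unreachable ones (the order matters)."""

    def body_ok(body, good):
        # Every symbol is a terminal or a non-terminal already known to be good.
        return all(s in good or s not in nonterminals for s in body)

    # Pass 1: non-terminals that can generate a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and body_ok(body, generating):
                generating.add(head)
                changed = True
    kept = [(h, b) for h, b in productions if h in generating and body_ok(b, generating)]

    # Pass 2: non-terminals reachable from the start symbol.
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for head, body in kept:
            if head in reachable:
                for s in body:
                    if s in nonterminals and s not in reachable:
                        reachable.add(s)
                        changed = True
    return [(h, b) for h, b in kept if h in reachable]


grammar = [("S", ("A", "B")), ("S", ("0",)), ("A", ("0",))]
print(remove_useless(grammar, {"S", "A", "B"}, "S"))
```

On the grammar from the text, pass 1 removes S -> AB because B generates nothing, and pass 2 then removes A -> 0 because A is no longer reachable.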

Lambda-Production Removal 

The basic idea is to find all non-terminals that can eventually produce a lambda (nullable non-terminals), and then take every production in the grammar and substitute lambdas for each subset of such non-terminals. We add all these new productions. We can then delete the actual lambda productions. For example, A -> 0N1N0, N -> λ. We add A -> 0N10 | 01N0 | 010, and delete N -> λ.

The problem with this strategy is that it must be done for all nullable non-terminals simultaneously. If not, here is a problem scenario: S -> 0 | X1 | 0Y0, X -> Y | λ, Y -> 1 | X. In this case, when we try to substitute for X -> λ, we add S -> 1 and Y -> λ, and then when we substitute for Y -> λ, we add S -> 00 and X -> λ. The trick is to calculate all nullable non-terminals at the start, which include X and Y, and then substitute for all simultaneously, deleting all resulting lambda productions, except perhaps for S -> λ. If lambda is actually in the language, then we add a special start symbol S' -> S | λ, where S remains the old start symbol.
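The "compute all nullable non-terminals first, then substitute simultaneously" idea can be sketched like this. It is an illustration under the assumption that a production is a (head, body) pair and that an empty tuple stands for λ; the special handling of S -> λ described above is left out, and the function names are hypothetical.

```python
from itertools import product


def nullable_set(productions):
    """Non-terminals that can derive the empty string; a body of () is a lambda production."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def remove_lambda(productions):
    """Add every variant with nullable symbols dropped, then discard lambda productions."""
    nullable = nullable_set(productions)
    result = set()
    for head, body in productions:
        # For each nullable symbol we may keep it (True) or drop it (False).
        options = [(True, False) if s in nullable else (True,) for s in body]
        for keep in product(*options):
            new_body = tuple(s for s, k in zip(body, keep) if k)
            if new_body:  # an empty body would be a lambda production, so skip it
                result.add((head, new_body))
    return result


example = [("A", tuple("0N1N0")), ("N", ())]
for head, body in sorted(remove_lambda(example)):
    print(head, "->", "".join(body))
```

For the first example in the text this yields A -> 0N1N0, 0N10, 01N0 and 010, and no production for N.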

Unit Productions 

To get rid of Unit productions like A -> B, we simply add A-> anything, for every production of the form B 
-> anything, and then delete A -> B. The only problem with this is that B might itself have Unit productions. Hence, 
like we did in lambda productions, we first calculate all Unit non-terminals that A can generate in one or more steps. 
Then the A -> anything productions are added for all the Unit non-terminals X in the list, where X -> anything, as 
long as anything is not a Unit production. The original Unit productions are then deleted. The Unit non-terminals that 
can be generated by A can be computed in a straightforward bottom-up manner similar to what we did earlier for 
lambda productions. 

For example, consider S -> A | 11, A -> B | 1, B -> S | 0. The Unit non-terminals that can be generated by S, A and B are A,B and B,S and A,S respectively. So we add S -> 1 and S -> 0; A -> 11 and A -> 0; and B -> 1 and B -> 11. Then we delete S -> A, A -> B and B -> S.
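The same computation can be written as a short sketch. It assumes productions are (head, body) pairs with tuple bodies and that the non-terminal set is given; remove_unit is an illustrative name.

```python
def remove_unit(productions, nonterminals):
    """Replace unit productions A -> B by the non-unit productions of everything A reaches."""

    def is_unit(body):
        return len(body) == 1 and body[0] in nonterminals

    # unit_reach[A] = non-terminals reachable from A using unit productions only (A included).
    unit_reach = {a: {a} for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if is_unit(body):
                for a in nonterminals:
                    if head in unit_reach[a] and body[0] not in unit_reach[a]:
                        unit_reach[a].add(body[0])
                        changed = True

    result = set()
    for a in nonterminals:
        for head, body in productions:
            if head in unit_reach[a] and not is_unit(body):
                result.add((a, body))
    return result


rules = [("S", ("A",)), ("S", ("1", "1")), ("A", ("B",)), ("A", ("1",)),
         ("B", ("S",)), ("B", ("0",))]
for head, body in sorted(remove_unit(rules, {"S", "A", "B"})):
    print(head, "->", "".join(body))
```

On the example above, each of S, A and B ends up with the three bodies 11, 1 and 0, which matches the productions listed in the text.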

Long Productions 

Now that we have gotten rid of all length zero and length one productions, we need to concentrate on length > 2 productions. Let A -> ABC; then we simply replace this with A -> XC and X -> AB, where X is a new non-terminal symbol. We can do this same trick inductively (recursively) for longer productions.

If we have long terminal productions or mixed terminal/non-terminal productions, then for each terminal symbol (say our alphabet is {0, 1}) we add productions M -> 0 and N -> 1. Then all 0's and 1's are replaced by M's and N's, where M and N are new non-terminal symbols.
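Both replacements can be sketched together. This is only an illustration: the fresh names X1, X2, ... and T_0, T_1 are assumptions standing in for the new non-terminals X, M and N of the text, and productions are again (head, body) pairs.

```python
def to_binary(productions, terminals):
    """Make every body of length >= 2 consist of exactly two non-terminals."""
    result = []
    term_var = {}  # terminal -> fresh non-terminal that produces it
    counter = 0
    for head, body in productions:
        if len(body) >= 2:
            # Replace terminals inside long or mixed bodies by their fresh non-terminal.
            body = tuple(term_var.setdefault(s, f"T_{s}") if s in terminals else s
                         for s in body)
        while len(body) > 2:
            # A -> B1 B2 B3 ... becomes X -> B1 B2 and A -> X B3 ...
            counter += 1
            fresh = f"X{counter}"
            result.append((fresh, body[:2]))
            body = (fresh,) + body[2:]
        result.append((head, body))
    result.extend((var, (t,)) for t, var in term_var.items())
    return result


print(to_binary([("A", ("A", "B", "C"))], {"0", "1"}))
print(to_binary([("S", ("0", "S", "1"))], {"0", "1"}))
```

The first call reproduces the A -> XC, X -> AB example; the second shows a mixed production being rewritten with one fresh non-terminal per terminal.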


Pushdown automata 


Definition 

A pushdown automaton is a system A = (Q, Σ, Γ, s, Δ, F) where

• Q is a finite set of states,

• Σ is an input alphabet,

• Γ is a stack alphabet,

• s ∈ Q is an initial state,

• Δ is a transition relation: Δ ⊆ (Q × (Σ ∪ {e}) × Γ*) × (Q × Γ*),

• F ⊆ Q is a set of final states.

It is useful to imagine that the automaton has a "stack" which can be filled with letters of Γ at one end, and the letters can be extracted from the stack at the same end, according to the principle "first in, last out". Consider a typical transition ((p, a, α), (q, β)) ∈ Δ, where p, q ∈ Q, a ∈ Σ ∪ {e}, and α, β ∈ Γ*. How it works: if in state p the observed letter is a and the word at the top of the stack is α, then the transition instructs the automaton to adopt the state q, move one position to the right along the input word, and replace the word α in the stack by β. In particular, the transition ((p, a, e), (q, b)) pushes b onto the stack, while ((p, a, b), (q, e)) "pops" b off the stack.


Example 1 

We construct a pushdown automaton accepting the language {a^n b^n ∈ {a, b}* | n ≥ 0}.
Let Q = {s, p, f}, Σ = {a, b}, Γ = {a}, F = {s, f},

Δ = {

(1) ((s, a, e), (p, a)),

(2) ((p, a, e), (p, a)),

(3) ((p, b, a), (f, e)),

(4) ((f, b, a), (f, e))}.


We represent a computation with the input word aabb in the form of a table:

State   Input remaining   Stack   Transition
s       aabb              e       -
p       abb               a       1
p       bb                aa      2
f       b                 a       3
f       e                 e       4

Thus the string is accepted.
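The table can be reproduced by a small simulator. The sketch below is an assumption-laden illustration, not part of the notes: the stack is a Python string whose first character is the top, the empty string stands for e, and a word is accepted when the input is consumed, the stack is empty and the state is final (as in the last row of the table). The search is breadth-first with a cap on the number of configurations.

```python
from collections import deque


def pda_accepts(transitions, start, finals, word, max_configs=10_000):
    """Breadth-first search over configurations (state, remaining input, stack)."""
    seen = set()
    queue = deque([(start, word, "")])
    while queue and len(seen) < max_configs:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if not rest and not stack and state in finals:
            return True
        for (p, a, alpha), (q, beta) in transitions:
            # The transition applies if the state matches, the input starts with a
            # and the top of the stack is the word alpha.
            if p == state and rest.startswith(a) and stack.startswith(alpha):
                queue.append((q, rest[len(a):], beta + stack[len(alpha):]))
    return False


# Transitions (1) to (4) of Example 1.
EXAMPLE_1 = [
    (("s", "a", ""), ("p", "a")),
    (("p", "a", ""), ("p", "a")),
    (("p", "b", "a"), ("f", "")),
    (("f", "b", "a"), ("f", "")),
]

for w in ["", "aabb", "aab", "abab"]:
    print(repr(w), pda_accepts(EXAMPLE_1, "s", {"s", "f"}, w))
```

Because the relation is nondeterministic in general, the simulator explores all applicable transitions rather than following a single path.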


Example 2: 

We construct a pushdown automaton accepting the language {w ∈ {a, b}* | w has the same number of a's as b's}.

The idea: push onto the stack the current excess of a's or b's (as the case may be).

To decide whether there is an excess, there should be a way of deciding whether the stack is empty or not. This is done by introducing a new stack letter, say c, to serve as the bottom of the stack.

Let Σ = {a, b}, Γ = {a, b, c}, Q = {s, q, f}, F = {f},

Δ = {

(1) ((s, e, e), (q, c)),

(2) ((q, a, c), (q, ac)),

(3) ((q, a, a), (q, aa)),

(4) ((q, a, b), (q, e)),

(5) ((q, b, c), (q, bc)),

(6) ((q, b, b), (q, bb)),

(7) ((q, b, a), (q, e)),

(8) ((q, e, c), (f, e))}.

A computation with the input word aabbba:

State   Input remaining   Stack   Transition
s       aabbba            e       -
q       aabbba            c       1
q       abbba             ac      2
q       bbba              aac     3
q       bba               ac      7
q       ba                c       7
q       a                 bc      5
q       e                 c       4
f       e                 e       8

We see that the string aabbba is accepted by the automaton.

From grammars to automata 

Theorem. There exists an algorithm which for any context-free grammar G constructs a pushdown automaton A such 
that L(G) = L(A). 

Sketch of a proof: Let G = (Σ, NT, R, S) be a context-free grammar. Define the automaton A as ({s, f}, Σ, Γ = Σ ∪ NT, s, Δ, {f}), where

Δ = {

(1) ((s, e, e), (f, S)),

(2) ((f, e, C), (f, w)) for each rule C -> w in R,

(3) ((f, a, a), (f, e)) for each a ∈ Σ}.

Informally, the automaton A simulates a “leftmost derivation” of the input word. A transformation of the type (2) 
simulates one step of the derivation, transformation of the type (3), maybe repeated several times, prepares the stack 
for another step of the derivation. 

Example 3: 

Consider the language of even-length palindromes: L = {ww^R | w ∈ {a, b}*}. The grammar G generating L is determined by the following set of rules:

{S -> aSa, S -> bSb, S -> e}.

According to the theorem, the transitions of a pushdown automaton accepting L can be obtained from this list of rules 
as follows: 

(1) ((s, e, e), (f, S)), 

(2) ((f, e, S), (f, aSa)), 

(3) ((f,e, S), (f, bSb)), 

(4) ((f, e, S), (f, e)), 

(5) ((f, a, a), (f, e)), 

(6) ((f,b, b), (f,e)). 

Apply A to the input abba:

State   Input remaining   Stack   Transition
s       abba              e       -
f       abba              S       1
f       abba              aSa     2
f       bba               Sa      5
f       bba               bSba    3
f       ba                Sba     6
f       ba                ba      4
f       a                 a       6
f       e                 e       5
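The construction in the theorem is mechanical, so it can be written as a few lines of code. This is a sketch under the assumptions that rules are (variable, right side) pairs of strings, that the empty string stands for e, and that the two states are named s and f as in the proof; grammar_to_pda is an illustrative name.

```python
def grammar_to_pda(rules, terminals, start_symbol="S"):
    """Build the transition list of the two-state automaton from the theorem."""
    delta = [(("s", "", ""), ("f", start_symbol))]           # type (1): push the start symbol
    delta += [(("f", "", c), ("f", w)) for c, w in rules]    # type (2): one per rule C -> w
    delta += [(("f", a, a), ("f", "")) for a in terminals]   # type (3): match a terminal
    return delta


palindrome_rules = [("S", "aSa"), ("S", "bSb"), ("S", "")]
for number, transition in enumerate(grammar_to_pda(palindrome_rules, "ab"), start=1):
    print(number, transition)
```

For the palindrome grammar this prints six transitions in the same order as the list (1) to (6) in Example 3.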


Turing Machine 

A Turing Machine (TM) is an abstract, mathematical model that describes what can and cannot be computed. A Turing Machine consists of a tape of infinite length, on which input is provided as a finite sequence of symbols. A head reads the input tape. The Turing Machine starts at the "start state" S0. On reading an input symbol it optionally replaces it with another symbol, changes its internal state and moves one cell to the right or left.

Notation for the Turing Machine:

TM = {Q, Σ, Γ, δ, q0, h}

Q = a set of TM states
Σ = a set of input symbols
Γ = a set of tape symbols
q0 = the start state
h = the halting state
δ = the transition function

Some of the characteristics of a Turing machine are: 

1. The symbols can be both read from the tape and written on it. 

2. The TM head can move in either direction - Left or Right. 

3. The tape is of infinite length 

4. The special states, Halting states and Accepting states, take immediate effect. 

Design a TM that erases all non-blank symbols on the tape, where the sequence of non-blank symbols does not contain any blank symbol # in between:

TM = {Q, Σ, Γ, δ, q0, h}

Q = {q0, h}
Σ = {a, b}
Γ = {a, b, #}
q0 is the initial state

q: state   a: input symbol   δ(q, a)
q0         a                 (q0, #, L)
q0         b                 (q0, #, L)
q0         #                 (h, #, N)
h          #                 ACCEPT


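Design a TM that recognizes the language of all strings of even length over the alphabet {a, b}.

TM = {Q, Σ, Γ, δ, q0, h}

Q = {q0, q1, h}
Σ = {a, b}
Γ = {a, b, #}
q0 is the initial state

q: states   a            b            #
q0          (q1, a, L)   (q1, b, L)   (h, #, N)
q1          (q0, a, L)   (q0, b, L)   *
h           *            *            ACCEPT

The machine alternates between q0 (an even number of symbols read so far) and q1 (an odd number). It halts in h only if it meets the blank # while in q0.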


Design a TM that recognizes the language of all strings which contain 'aba' as a substring.

TM = {Q, Σ, Γ, δ, q0, h}

Q = {q0, q1, q2, h}
Σ = {a, b}
Γ = {a, b, #}
q0 is the initial state

[Transition table: only part of it is legible. The legible entries are (h, a, L), (q0, b, L), ACCEPT, and the q1 row below.]

q: states   a            b            #
q1          (q1, a, L)   (q2, b, L)   *

Design a TM which computes the function f(m) = m + 1 for each m that belongs to the set of natural numbers.

Given the function f(m) = m + 1. Here we represent the input m on the tape by a number of 1's on the tape.

For example, if m = 1, the input will be #1#, and if m = 2, the input will be #11#.

TM = {Q, Σ, Γ, δ, q0, h}

Q = {q0, q1, h}
Γ = {1, #}
q0 is the initial state

q: states   a = 1        a = #
q0          (q1, 1, R)   (q1, #, R)
q1          *            (h, 1, R)
h           *            *

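Design a TM that replaces every 0 with 1 and every 1 with 0 in a binary string.

TM = {Q, Σ, Γ, δ, q0, h}

Q = {q0, q1, h}
Σ = {0, 1}
Γ = {0, 1, #}
q0 is the initial state

q: states   0            1            #
q0          (q0, 1, L)   (q0, 0, L)   (q1, #, R)
q1          (q1, 0, R)   (q1, 1, R)   (h, #, N)
h           *            *            ACCEPT

In q0 the machine flips each symbol while moving left. In q1 it moves back to the right without changing anything, and halts at the blank.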


Given a string of 1s on a tape (followed by an infinite number of 0s), add one more 1 at the end of the string.

Input: #111100000000.

Output: #1111100000000.

Initially the TM is in the start state S0. Move right as long as the input symbol is 1. When a 0 is encountered, replace it with 1 and halt.

Transitions:

(S0, 1) -> (S0, 1, R)

(S0, 0) -> (S0, 1, STOP)
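The two transitions can be run on a small simulator. The following is a minimal sketch, assuming the tape is a finite Python list that is extended with a blank symbol on demand, that the head starts on the first 1, and that the move is one of R, L or STOP; run_tm and its parameters are illustrative names.

```python
def run_tm(transitions, tape, state="S0", head=0, blank="0", max_steps=1000):
    """Run a single-tape machine; `transitions` maps (state, symbol) to
    (new_state, written_symbol, move). Returns the tape contents when the machine stops."""
    cells = list(tape)
    for _ in range(max_steps):
        key = (state, cells[head])
        if key not in transitions:
            break  # no applicable transition: the machine halts
        state, cells[head], move = transitions[key]
        if move == "STOP":
            break
        head += 1 if move == "R" else -1
        if head < 0:  # grow the tape to the left
            cells.insert(0, blank)
            head = 0
        elif head == len(cells):  # grow the tape to the right
            cells.append(blank)
    return "".join(cells)


APPEND_ONE = {
    ("S0", "1"): ("S0", "1", "R"),
    ("S0", "0"): ("S0", "1", "STOP"),
}

# The head starts at index 1, just after the leading '#'.
print(run_tm(APPEND_ONE, "#111100000000", head=1))
```

Only a finite window of the infinite tape is stored, so the number of trailing 0s in the printed result depends on how many were supplied.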


P AND NP CLASS PROBLEMS 

Up to now we have considered problems that can be solved by algorithms in worst-case polynomial time. Not every problem has such an apparent solution. From the point of view of solving problems with computers, some problems can be solved in a limited amount of time (e.g. sorting), some appear to require an unmanageable amount of time (e.g. Hamiltonian cycles), and some cannot be solved at all (e.g. the Halting Problem). In this section we concentrate on the specific class of problems called NP-complete problems.

Tractable and Intractable Problems 

We call a problem tractable, or easy, if it can be solved by a polynomial time algorithm. Problems that cannot be solved in polynomial time, but require a super-polynomial time algorithm, are called intractable, or hard. There are many problems for which no algorithm with running time better than exponential is known; some of them are the traveling salesman problem, Hamiltonian cycles, and circuit satisfiability.

P and NP classes and NP completeness 

The set of problems that can be solved by a polynomial time algorithm is regarded as class P. The problems that are verifiable in polynomial time constitute the class NP. The class of NP-complete problems consists of those problems that are in NP and are as hard as any problem in NP (more on this later). The main concern in studying NP-completeness is to understand how hard a problem is. So if we can show that a problem is NP-complete, we try to solve it using methods like approximation, rather than searching for a faster algorithm that solves the problem exactly.

Problems 

Abstract Problems 

An abstract problem A is a binary relation on a set I of problem instances and a set S of problem solutions. For example, the minimum spanning tree problem can be viewed as pairs of a given graph G and an MST T of that graph.

Decision Problems

A decision problem D is a problem whose answer is either "true" ("yes", "1") or "false" ("no", "0"). For example, if we have the abstract shortest path problem with its instances and the solution set {0,1}, then we can transform that abstract problem by reformulating it as "Is there a path from u to v with at most k edges?". In this situation the answer is either yes or no.

Optimization Problems

We encounter many problems where there are many feasible solutions and our aim is to find the feasible solution with the best value. This kind of problem is called an optimization problem. For example, given the graph G and the vertices u and v, find the path from u to v with the minimum number of edges. NP-completeness does not deal directly with optimization problems; however, we can translate an optimization problem into a decision problem.

Complexity Class P 

Complexity class P is the set of concrete decision problems that are solvable in polynomial time by a deterministic algorithm. If we have an abstract decision problem A with instance set I mapping to the set {0,1}, an encoding e: I -> {0,1}* is used to denote the concrete decision problem e(A). The solution to both the abstract problem instance i ∈ I and the concrete problem instance e(i) ∈ {0,1}* is A(i) ∈ {0,1}. It is important to understand that the choice of encoding can greatly change the running time of an algorithm as a function of the input size. For example, take an algorithm whose only input is a natural number k and whose running time is O(k). If k is given in unary (k copies of 1), the input size is n = k and the running time is O(n). If k is instead encoded in binary, it can be represented with about log k bits (using only 0 and 1), so n = log k and the same O(k) running time becomes O(2^n). In our discussion we therefore discard encodings like unary, so that the remaining encodings do not differ much in complexity.

We define a polynomial time computable function f: {0,1}* -> {0,1}* with respect to some polynomial time algorithm PA such that, given any input x ∈ {0,1}*, PA produces the output f(x).

For some set I of problem instances, two encodings e1 and e2 are polynomially related if there are two polynomial time computable functions f and g such that for any i ∈ I, both f(e1(i)) = e2(i) and g(e2(i)) = e1(i) are true, i.e. each encoding can be computed from the other in polynomial time by some algorithm.
Polynomial time reduction

Given two decision problems A and B, a polynomial time reduction from A to B is a polynomial time function f that transforms the instances of A into instances of B such that the output of the algorithm for problem A on input instance x is the same as the output of the algorithm for problem B on input instance f(x). If there is a polynomial time computable function f such that it is possible to reduce A to B, then this is denoted as A ≤p B. The function f described above is called the reduction function, and the algorithm for computing f is called the reduction algorithm.



Complexity Class NP 

NP is the set of decision problems solvable by nondeterministic algorithms in polynomial time. For many problems it is much easier to verify that a given value is a solution than to calculate a solution. Using this idea, we say a problem is in class NP (nondeterministic polynomial time) if there is an algorithm that verifies a proposed solution to the problem in polynomial time.
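As an illustration of "verifiable in polynomial time", here is a sketch of a verifier for the Hamiltonian cycle problem mentioned earlier. It assumes an undirected graph with at least three vertices numbered 0 to n-1 and a proposed cycle given as a list of vertices; the function name and input format are assumptions made for this example.

```python
def verify_hamiltonian_cycle(num_vertices, edges, cycle):
    """Check that `cycle` visits every vertex exactly once and that consecutive
    vertices (wrapping around at the end) are joined by an edge."""
    if sorted(cycle) != list(range(num_vertices)):
        return False
    edge_set = {frozenset(e) for e in edges}
    return all(
        frozenset((cycle[i], cycle[(i + 1) % num_vertices])) in edge_set
        for i in range(num_vertices)
    )


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))
print(verify_hamiltonian_cycle(4, square, [0, 2, 1, 3]))
```

The check takes time polynomial in the size of the graph, even though no polynomial time algorithm is known for finding such a cycle.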

PREVIOUS YEAR QUESTION SOLUTIONS 

CSC-251-2067 (FIRST BATCH) 


Attempt all the questions 


Group A 

1. Define Finite Automata with ε moves. Does an ε-NFA have more computational power than a DFA?

An ε-NFA can be represented as:

A = (Q, Σ, δ, q0, F) where,

Q = a finite set of states.

Σ = a set of input symbols.

δ = a function that takes as arguments:

i. A state in Q and

ii. A member of Σ ∪ {ε}, that is, either an input symbol or the symbol ε.

q0 = an initial state, q0 ∈ Q.

F = a set of final or accepting states.

It appears that an ε-NFA is more powerful than a DFA, but this is not the case. Say we have two kinds of machine, A and B; then B is less powerful than A if and only if:

• Some A can recognize every language a B can recognize.

• There is some language that can be recognized by an A but not by any B.

Since every language accepted by an ε-NFA is also accepted by some DFA, they are of equal power.

2. Give the DFA accepting the strings over {a, b} such that each string does not start with ab. 



3. Give regular expressions for the following languages:

a) L = {S | S ∈ {a, b}* and S starts with aa or b and does not contain the substring bb}.

Ans: (aa+b)(a+ab)*

b) L = {S | S ∈ {0, 1}* and 0 occurs in pairs, if any, and S ends with 1}.

Ans: 1*(00)*1*(00)*1

4. Convert the following regular grammar into a Finite Automaton.

S -> aaB | aB | ε
B -> bb | bS | aBB

See note

5. Convert the following grammar into an equivalent PDA.

S -> AAC, A -> aAb | ε, C -> aC | b | ab

See note

6. What is a multi track Turing Machine? How it differs with single-track machine? 

A Multi-track Turing Machine is a specific type of Multi-tape Turing Machine. In a standard n-tape Turing machine, n heads move independently along n tracks. In an n-track Turing machine, one head reads and writes on all tracks simultaneously. A tape position in an n-track Turing Machine contains n symbols from the tape alphabet. It is equivalent to the standard Turing machine and therefore accepts precisely the recursively enumerable languages.

Formal definition

A multitape Turing machine can be formally defined as a 6-tuple, where

■ Q is a finite set of states

■ Σ is a finite set of symbols called the tape alphabet

■ q0 ∈ Q is the initial state

■ F ⊆ Q is the set of final or accepting states.

■ δ ⊆ (Q \ A × Σ) × (Q × Σ × d) is a relation on states and symbols called the transition relation.

■ Γ is the set of tape symbols

A single-track TM is a TM with a single tape that has only one track.


7. Construct a Turing Machine that accepts the language of palindrome over {a, b}* with each string of 
odd length. 

See Chapter Turing Machine examples 

8. What is an algorithm? Explain on the basis of Church Hypothesis. 

According to Church, “No computational procedure will be considered an algorithm unless it can be 
represented by a Turing Machine.” 

In other words, an algorithm is a procedure that can be carried out by some Turing machine. This connection between the informal notion of algorithm and the precise definition is known as the Church-Turing thesis. More precisely, Turing proposed to adopt the Turing machine that halts on all inputs as the precise formal notion corresponding to our intuitive notion of an "algorithm".

Note that the Church-Turing thesis is not a theorem, but a “thesis”, as it asserts that a certain informal concept 
(algorithm) corresponds to a certain mathematical object (Turing machine). Not being a mathematical 
statement, the Church-Turing thesis cannot be proved! What are the implications of this fact? 

Well, it is in principle possible that the Church-Turing thesis can be disproved. 


Group B 

9. How can an ε-NFA be converted into an NFA and a DFA? Explain with a suitable example.

Conversion of ε-NFA to corresponding DFA:

In order to convert an ε-NFA to the corresponding DFA, the following steps should be taken:

i. The ε-closure of the starting state should be determined, and the union of the ε-closures of the transitions for every symbol of Σ should be calculated.

ii. The same process should be repeated until no new states appear.

Firstly,

the ε-closure of the starting state is
ε-closure(q0) = {q0, q1, q2} = ε(q0)

Similarly, the other closures are:

ε(q1) = {q1, q2}

ε(q2) = {q2}

Then,

δc({q0, q1, q2}, a) = ε(q0) ∪ φ ∪ ε(q2)
                    = {q0, q1, q2} ∪ {q2}
                    = {q0, q1, q2}

δc({q0, q1, q2}, b) = φ ∪ ε(q1) ∪ φ = {q1, q2}

δc({q1, q2}, a) = φ ∪ ε(q2) = {q2}

δc({q1, q2}, b) = ε(q1) ∪ φ = {q1, q2}

δc({q2}, a) = ε(q2) = {q2}

δc({q2}, b) = φ
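The closure computations above can be checked with a short subset-construction sketch. The ε-NFA used here is an assumption inferred from the computed sets (q0 loops on a, q1 loops on b, q2 loops on a, with ε-moves q0 to q1 and q1 to q2); the function names are illustrative.

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` using ε-moves only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction(start, alphabet, delta, eps):
    """Return the DFA start state and its transition table over sets of NFA states."""
    start_set = epsilon_closure({start}, eps)
    table, todo = {}, [start_set]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {}
        for symbol in alphabet:
            moved = {r for q in current for r in delta.get((q, symbol), ())}
            target = epsilon_closure(moved, eps)
            table[current][symbol] = target
            if target not in table:
                todo.append(target)
    return start_set, table


# Assumed ε-NFA, reconstructed from the closures computed in the text.
DELTA = {("q0", "a"): {"q0"}, ("q1", "b"): {"q1"}, ("q2", "a"): {"q2"}}
EPS = {"q0": {"q1"}, "q1": {"q2"}}

start_state, dfa = subset_construction("q0", "ab", DELTA, EPS)
for state, row in dfa.items():
    print(sorted(state), {s: sorted(t) for s, t in row.items()})
```

The empty set appears as a fourth DFA state; it plays the role of φ in the hand computation.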



10. Find the minimum state DFA equivalent to the following DFA. 


State   0   1
->A     B   C
B       B   D
C       E   D
D       E   D
*E      A   D

Here, we don't have any unreachable state. Then, separating final and non-final states as

State   0   1          State   0   1
->A     B   C          E*      A   D
B       B   D
C       E   D
D       E   D

Then eliminating equivalent states as:

State   0   1
A       B   C
B       B   C

Again,

State   0   1
A       A   C
C       E   C

State   0   1
E*      A   C

Now, merging these two tables as

State   0   1
A       A   C
C       E   C
E*      A   C

[Figure: state diagram of the minimized DFA with states A, C and E*]
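The result can be cross-checked with a partition-refinement sketch. It assumes, as in the answer above, that every state is reachable; the function name minimize and the dictionary layout of the transition table are illustrative.

```python
def minimize(states, alphabet, delta, finals):
    """Split blocks until all states in a block agree, for every symbol,
    on the block that the symbol leads to. Returns the list of blocks."""
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    while True:
        block_of = {q: i for i, block in enumerate(partition) for q in block}
        refined = {}
        for q in states:
            # States with the same signature stay together.
            signature = (block_of[q],) + tuple(block_of[delta[q, a]] for a in alphabet)
            refined.setdefault(signature, set()).add(q)
        new_partition = list(refined.values())
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition


# Transition table of the DFA from question 10.
TABLE = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "B", ("B", "1"): "D",
    ("C", "0"): "E", ("C", "1"): "D",
    ("D", "0"): "E", ("D", "1"): "D",
    ("E", "0"): "A", ("E", "1"): "D",
}

blocks = minimize("ABCDE", "01", TABLE, {"E"})
print(sorted(sorted(b) for b in blocks))
```

The blocks {A, B}, {C, D} and {E} correspond to the states A, C and E* of the merged table.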



11. Show that a language L is accepted by some DFA if and only if L is accepted by some NFA. 

See the chapters Finite Automata 

12. Define the language of PDA that accepts by Final State. Explain, how a PDA accepting by empty stack 
can be converted into a PDA by final state. 

From Empty Stack to Final State: 

Theorem: If L = N(PN) for some PDA PN = (Q, Σ, Γ, δN, q0, Z0), then there is a PDA PF such that L = L(PF).

The method of conversion is as follows.

We use a new symbol X0, which must not be a symbol of Γ, to denote the stack start symbol for PF. Also add a new start state p0 and a final state pf for PF. Let PF = (Q ∪ {p0, pf}, Σ, Γ ∪ {X0}, δF, p0, X0, {pf}), where δF is defined by

δF(p0, ε, X0) = {(q0, Z0X0)}, to push X0 to the bottom of the stack

δF(q, a, y) = δN(q, a, y), a ∈ Σ or a = ε and y ∈ Γ, same for both PN and PF

δF(q, ε, X0) = {(pf, ε)}, to accept the string by moving to the final state

The moves of PF to accept a string w can be written like:

(p0, w, X0) |-PF (q0, w, Z0X0) |-*PF (q, ε, X0) |-PF (pf, ε, ε)

From Final State to Empty Stack:

Theorem: If L = L(PF) for some PDA PF = (Q, Σ, Γ, δF, q0, Z0, F), then there is a PDA PN such that L = N(PN).

The method of conversion is as follows.

To avoid PF accidentally emptying its stack, initially change the stack start content from Z0 to Z0X0. Also add a new start state p0 and a new state p for PN. Let PN = (Q ∪ {p0, p}, Σ, Γ ∪ {X0}, δN, p0, X0), where δN is defined by:

δN(p0, ε, X0) = {(q0, Z0X0)}, to change the stack content initially

δN(q, a, y) contains δF(q, a, y), a ∈ Σ or a = ε and y ∈ Γ, same for both

δN(q, ε, y) contains (p, ε), q ∈ F, y ∈ Γ or y = X0, to start emptying the stack once PF has accepted

δN(p, ε, y) = {(p, ε)}, y ∈ Γ or y = X0, to pop the remaining stack contents

The moves of PN to accept a string w can be written like:

(p0, w, X0) |-PN (q0, w, Z0X0) |-*PN (q, ε, αX0) |-*PN (p, ε, ε), where q ∈ F

13. Explain about multi tape TM. Show that every language accepted by a multi-tape Turing Machine is 
also accepted by one tape Turing Machine. 

For solution see the respective chapters 

14. Write short notes on: 

(a) Decidable vs. Un-decidable problems. 

(b) Unrestricted Grammar. 

(c) NP-Completeness. 

(d) CNF-SAT Problem. 

For solution see the respective chapters 


TOC (2067) Second Batch 


Group A 


1. What is a DFA? How does it differ from an NFA? Explain.

A DFA is defined as a five-tuple (Q, Σ, δ, q0, F) where,

Q = finite set of states
Σ = finite set of input symbols
δ = transition function that maps Q × Σ -> Q
q0 = starting state, q0 ∈ Q
F = set of final (accepting) states, F is a subset of Q

A DFA differs from an NFA in the type of value that the transition function returns. In a DFA, the transition function takes a state in Q and an input symbol in Σ as arguments and returns a single state; in an NFA it returns a set of states. It means that the transition function takes a DFA from one state to exactly one other state, while it can take an NFA from one state to several other states.

For DFA,

δ(q0, a) = q1

For NFA,

δ(q0, a) = {q0, q1}

Here, in the DFA the state q1 is reached when the input symbol 'a' is given in the state q0. In the NFA both q0 and q1 are reached when the input symbol 'a' is given in the state q0. It shows that the DFA can have only one next state for any input symbol, while the NFA can have more than one next state for an input symbol.

2. Give the DFA for language of strings over {0,1} in which each strings end with 11. 
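One possible DFA for this language can be written as a transition table. This is a sketch, not the only answer: the state names q0 (no trailing 1), q1 (one trailing 1) and q2 (at least two trailing 1s, accepting) are assumptions made for this example.

```python
# Transition function of a three-state DFA for strings over {0, 1} that end with 11.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q0", ("q2", "1"): "q2",
}


def accepts(word, start="q0", finals=("q2",)):
    """Follow exactly one transition per input symbol, as a DFA does."""
    state = start
    for symbol in word:
        state = DELTA[state, symbol]
    return state in finals


for w in ["11", "0111", "110", "1", ""]:
    print(repr(w), accepts(w))
```

Each (state, symbol) pair maps to a single next state, which is the property that distinguishes a DFA from an NFA.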



3. For the regular expression (a+b)*baa, construct an ε-NFA.



4. Define the term parse tree, regular grammar, sentential form and ambiguous grammar. 

Parse tree 

A parse tree is the tree representation of a derivation defined by a grammar. It is very important for the syntax analysis of any programming language. A parse tree is a tree with the following conditions:

- Each interior node of a parse tree is labeled with a variable.

- Each leaf node is labeled with ε or a terminal. If it is labeled with ε, then it is the only child of its parent.

Regular grammar

A regular grammar is a CFG which may be left linear or right linear. A grammar in which all productions are of the form A -> wB or A -> w, where A, B ∈ V and w ∈ T*, is called a right linear grammar.

If all the productions are of the form A -> Bw or A -> w, where A, B ∈ V and w ∈ T*, it is a left linear grammar. A regular grammar always represents a language that is accepted by a finite automaton, which is called a regular language.

S -> 0S | 1S | 0 (right linear grammar)

S -> S0 | S1 | 0 (left linear grammar)

Sentential form

The derivations from the start symbol produce strings that have a special role. We call these "sentential forms". That is, if G = (V, T, P, S) is a CFG, then any string α in (V ∪ T)* such that S =>* α is a sentential form.

Ambiguous grammar

A CFG is called ambiguous if, for at least one word in the language that it generates, there are two or more possible derivations of the word that correspond to different syntax trees.

5. Give the formal definition of NPDA. How it differs with DPDA? Explain. 

A nondeterministic pushdown automaton (npda) is basically an nfa with a stack added to it. 

A nondeterministic pushdown automaton or npda is a 7-tuple 

M = (Q, 2, r, 5, qo, Z, F) 

where 
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• Q is a finite set of states, 

• 2 is a the input alphabet, 

• IT is the stack alphabet, 

• 5 is a transition function, 

• qo ^ Q is the initial state, 

• z^Tis the stack start symbol, and 

• F £ Q is a set of final states. 

The NPDA differs from the DPDA in the following respect:

A PDA is deterministic if there is never a choice of move in any situation. If δ(q, a, X) contains
more than one pair, then the PDA is certainly nondeterministic, because we can choose among these pairs when
deciding on the next move. However, even if δ(q, a, X) is always a singleton, we could still have a choice
between using a real input symbol and making a move on ε. Hence, a PDA is deterministic if and only if the
following conditions are met:

• δ(q, a, X) has at most one member for any q in Q, a in Σ or a = ε, and X in Γ.

• If δ(q, a, X) is nonempty for some a in Σ, then δ(q, ε, X) must be empty.
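The two conditions translate directly into a test on a transition table. This is only a sketch: the dictionary encoding of δ, the use of the empty string for ε and the two small example tables are assumptions made for the illustration.

```python
EPS = ""  # the empty string stands for an epsilon move (an encoding chosen here)


def is_deterministic(delta):
    """delta maps (state, input_symbol_or_EPS, stack_top) to a set of
    (next_state, string_to_push) pairs."""
    for (q, a, X), moves in delta.items():
        # Condition 1: at most one move for every (q, a, X).
        if len(moves) > 1:
            return False
        # Condition 2: a real-input move and an epsilon move must not coexist.
        if a != EPS and moves and delta.get((q, EPS, X)):
            return False
    return True


# Hypothetical table: push an A per a, pop an A per b, finish on Z.
dpda = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q1", EPS, "Z"): {("q2", "Z")},
}
# Adding an epsilon move that competes with reading a breaks condition 2.
npda = dict(dpda)
npda[("q0", EPS, "A")] = {("q1", "A")}

print(is_deterministic(dpda), is_deterministic(npda))  # True False
```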

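6. Construct a Turing Machine that accepts the language of strings over {a, b} in which each string has even
length. Show how it accepts the string abab.

Design a TM that recognizes the language of all strings of even length over the alphabet {a, b}.

TM = (Q, Σ, Γ, δ, q0, h)

Q = {q0, q1, h}

Σ = {a, b}
Γ = {a, b, #}
q0 is the initial state and h is the halting (accepting) state.

State    a             b             #
q0       (q1, a, L)    (q1, b, L)    (h, #, L)
q1       (q0, a, L)    (q0, b, L)    *
h        *             *             ACCEPT

Here * marks an undefined move. Every move is to the left, so the head is taken to start on the rightmost
input symbol and to scan the input from right to left. In q0 an even number of symbols has been read so far
and in q1 an odd number, so the machine halts in h exactly when it meets the blank # while in q0.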
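The acceptance of abab can be traced with a small simulation. This is a sketch under two assumptions that are not stated explicitly in the table: the head starts on the rightmost input symbol (every move is L), and on reading the blank # in q0 the machine enters the halt state h.

```python
BLANK = "#"
# Transition table: (state, symbol) -> (next_state, symbol_to_write, move).
# The q0-on-blank entry is ASSUMED to lead to the halt state h.
DELTA = {
    ("q0", "a"): ("q1", "a", "L"),
    ("q0", "b"): ("q1", "b", "L"),
    ("q0", BLANK): ("h", BLANK, "L"),
    ("q1", "a"): ("q0", "a", "L"),
    ("q1", "b"): ("q0", "b", "L"),
}


def run(word):
    """Return (accepted, trace); the head starts on the rightmost symbol."""
    tape = dict(enumerate(word))
    state, head = "q0", len(word) - 1
    trace = [(state, head)]
    while state != "h":
        symbol = tape.get(head, BLANK)
        if (state, symbol) not in DELTA:
            return False, trace          # undefined move: reject
        state, write, move = DELTA[(state, symbol)]
        tape[head] = write
        head += -1 if move == "L" else 1
        trace.append((state, head))
    return True, trace


accepted, trace = run("abab")
print(accepted)                 # True
print([s for s, _ in trace])    # ['q0', 'q1', 'q0', 'q1', 'q0', 'h']
```

The state alternates between q0 and q1 once per symbol; after four symbols the machine is back in q0, reads the blank and halts in h. On aba it would meet the blank in q1, where no move is defined, and reject.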


7. Give the formal definition of Turing Machine. How does it differ from PDA?

A Turing Machine is a 7-tuple M = (Q, Σ, Γ, δ, q0, B, F) where
Q : the finite set of states of the finite control
Σ : the finite set of input symbols
Γ : the complete set of tape symbols (Σ ⊆ Γ)

δ : the transition function δ : Q × Γ → Q × Γ × {L, R, S}, where L, R and S give the direction of the head
(left, right or stationary), i.e. δ(q, X) = (p, Y, D)

q0 : the start state

B : the blank symbol, B ∈ Γ

F : the set of final (accepting) states

The Turing Machine differs from the PDA in the following points:

• The cells of the input tape of a PDA are only read (scanned) and are never changed or written, whereas the
cells of the tape of a TM can also be written.

• The tape head of a PDA always moves from left to right, whereas the tape head of a TM can move in both
directions.

8. Explain about the Unrestricted Grammar.

An unrestricted grammar can describe languages that are not context free. The languages described by
unrestricted grammars are exactly those accepted by Turing machines. The example shows an unrestricted grammar
generating {a^n b^n c^n | n ≥ 1}

S → FS1
S1 → ABCS1
S1 → ABC
BA → AB
CA → AC
CB → BC
FA → a
aA → aa
aB → ab
bB → bb
bC → bc
cC → cc

Derivation of aabbcc:

S ⇒ FS1
⇒ FABCS1
⇒ FABCABC
⇒ aBCABC
⇒ aBACBC
⇒ aABCBC
⇒ aaBCBC
⇒ aabCBC
⇒ aabBCC
⇒ aabbCC
⇒ aabbcC
⇒ aabbcc
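Each line of the derivation should follow from the previous one by a single rule application. The sketch below checks this mechanically; the tuple encoding of sentential forms and the helper names (tokens, one_step, is_derivation) are choices made for this illustration, with S1 treated as one symbol.

```python
RULES = [
    (("S",), ("F", "S1")),
    (("S1",), ("A", "B", "C", "S1")),
    (("S1",), ("A", "B", "C")),
    (("B", "A"), ("A", "B")),
    (("C", "A"), ("A", "C")),
    (("C", "B"), ("B", "C")),
    (("F", "A"), ("a",)),
    (("a", "A"), ("a", "a")),
    (("a", "B"), ("a", "b")),
    (("b", "B"), ("b", "b")),
    (("b", "C"), ("b", "c")),
    (("c", "C"), ("c", "c")),
]


def tokens(text):
    """Split 'FABCS1' into ('F', 'A', 'B', 'C', 'S1')."""
    out = []
    for ch in text:
        if ch == "1":
            out[-1] += "1"      # the digit belongs to the preceding S
        else:
            out.append(ch)
    return tuple(out)


def one_step(form):
    """All sentential forms reachable from `form` by applying one rule once."""
    results = set()
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == lhs:
                results.add(form[:i] + rhs + form[i + n:])
    return results


def is_derivation(forms):
    """True if every form follows from its predecessor in exactly one step."""
    return all(b in one_step(a) for a, b in zip(forms, forms[1:]))


STEPS = ["S", "FS1", "FABCS1", "FABCABC", "aBCABC", "aBACBC", "aABCBC",
         "aaBCBC", "aabCBC", "aabBCC", "aabbCC", "aabbcC", "aabbcc"]
print(is_derivation([tokens(s) for s in STEPS]))   # True
```

Because the left-hand sides may contain more than one symbol (BA → AB, aB → ab and so on), the rewriting depends on context, which is what distinguishes this grammar from a context free one.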

9. Show that a language L is accepted by some DFA if and only if L is accepted by some NFA.

If a language L is accepted by some DFA, then L is accepted by some NFA. (The converse direction follows from
the subset construction.)

Proof:

Let D = (Q_D, Σ, δ_D, q0, F_D) be a DFA. Define an equivalent NFA N = (Q_N, Σ, δ_N, q0, F_N), where Q_N = Q_D, F_N = F_D
and δ_N is defined by: if δ_D(q, a) = p then δ_N(q, a) = {p}.

We have to show that if δ̂_D(q0, w) = p then δ̂_N(q0, w) = {p}, where δ̂ denotes the extended transition function.

Using induction on |w|:

1. Basis:

Let |w| = 0, i.e. w = ε.

δ̂_D(q0, w) = δ̂_D(q0, ε) = q0

δ̂_N(q0, w) = δ̂_N(q0, ε) = {q0}

So the claim holds for w = ε.

2. Inductive step:

Let |w| = n + 1 and w = xa, where x is w without its last symbol, so |x| = n and |a| = 1.

The inductive hypothesis is:

if δ̂_D(q0, x) = p then δ̂_N(q0, x) = {p}.

Now, δ̂_D(q0, xa) = δ_D(δ̂_D(q0, x), a) = δ_D(p, a) = r (say)

δ̂_N(q0, xa) = δ_N(δ̂_N(q0, x), a) = δ_N({p}, a) = {r}

Therefore, if δ̂_D(q0, xa) = r then δ̂_N(q0, xa) = {r}. Since F_N = F_D, N accepts w exactly when D accepts w.

10. State and prove pumping lemma for regular language. Show by example how it can be used to prove that a
language is not regular.

Statement: Let L be a regular language. Then there exists an integer constant n such that for any x ∈ L with
|x| ≥ n there are strings u, v and w such that x = uvw, |uv| ≤ n, |v| > 0, and uv^k w ∈ L for all k ≥ 0.

For the proof, see the chapter CFG and examples.
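As an illustration of how the lemma is used, take L = {a^m b^m | m ≥ 1} and, for a claimed constant n, the string x = a^n b^n. Any split x = uvw with |uv| ≤ n and |v| > 0 puts v entirely inside the a's, so uv^2w has more a's than b's and is not in L. The sketch below (function names are made up here) checks this exhaustively for a few values of n; it illustrates the argument and does not replace the proof.

```python
def in_language(s):
    """Membership in L = { a^m b^m : m >= 1 }."""
    m = len(s) // 2
    return m >= 1 and s == "a" * m + "b" * m


def every_split_fails(n):
    """True if pumping with k = 2 leaves L for every allowed split of a^n b^n."""
    x = "a" * n + "b" * n
    for uv_len in range(1, n + 1):              # |uv| <= n
        for v_len in range(1, uv_len + 1):      # |v| > 0
            u = x[:uv_len - v_len]
            v = x[uv_len - v_len:uv_len]
            w = x[uv_len:]
            if in_language(u + v * 2 + w):
                return False                    # this split would survive pumping
    return True


print(all(every_split_fails(n) for n in range(1, 8)))   # True
```

Since no split survives for any n, no pumping constant can exist for L, so L is not regular.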

11. Define Context Free Grammar. Given the following CFG

S → 0AS | 0, A → S1A | SS | 10

For the string 001001100, give the leftmost and rightmost derivations and also construct a parse tree.

A context free grammar is defined by a 4-tuple (V, T, P, S) where

V = set of variables (nonterminals)
T = set of terminal symbols

P = set of rules or productions

S = the start symbol, S ∈ V

For the leftmost derivation:

S ⇒ 0AS
⇒ 0S1AS
⇒ 00AS1AS
⇒ 0010S1AS
⇒ 001001AS
⇒ 00100110S
⇒ 001001100

For the rightmost derivation:

S ⇒ 0AS
⇒ 0A0
⇒ 0S1A0
⇒ 0S1100
⇒ 00AS1100
⇒ 00A01100
⇒ 001001100

Both derivations correspond to the same parse tree, written here in bracketed form (each node is followed by
its children):

S[ 0 A[ S[ 0 A[ 1 0 ] S[ 0 ] ] 1 A[ 1 0 ] ] S[ 0 ] ]
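The two derivations can be checked step by step. The sketch below assumes single-character symbols and uses made-up helper names; it confirms that every step rewrites the leftmost (respectively rightmost) variable using one production of the grammar.

```python
GRAMMAR = {"S": ["0AS", "0"], "A": ["S1A", "SS", "10"]}


def leftmost_successors(form):
    """Forms obtained by rewriting the leftmost variable of `form` once."""
    for i, sym in enumerate(form):
        if sym in GRAMMAR:
            return {form[:i] + body + form[i + 1:] for body in GRAMMAR[sym]}
    return set()  # no variable left


def rightmost_successors(form):
    """Forms obtained by rewriting the rightmost variable of `form` once."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in GRAMMAR:
            return {form[:i] + body + form[i + 1:] for body in GRAMMAR[form[i]]}
    return set()


def check(steps, successors):
    """True if each form is a valid successor of the one before it."""
    return all(b in successors(a) for a, b in zip(steps, steps[1:]))


LEFT = ["S", "0AS", "0S1AS", "00AS1AS", "0010S1AS", "001001AS",
        "00100110S", "001001100"]
RIGHT = ["S", "0AS", "0A0", "0S1A0", "0S1100", "00AS1100",
         "00A01100", "001001100"]
print(check(LEFT, leftmost_successors), check(RIGHT, rightmost_successors))  # True True
```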

12. Define deterministic PDA. Design a PDA that accepts the language {a^n b^n | n > 0}. You may accept
either by empty stack or by final state.

A PDA is a deterministic PDA if and only if the following conditions are met:

• δ(q, a, X) has at most one member for any q in Q, a in Σ or a = ε, and X in Γ.

• If δ(q, a, X) is nonempty for some a in Σ, then δ(q, ε, X) must be empty.

For the proof, see the chapter PDA.
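One possible design, given here only as a sketch and not necessarily the construction used in the PDA chapter, pushes one stack symbol per a, pops one per b, and accepts by final state. The state names q0, q1, q2 and the stack symbols Z and A are choices made for this illustration.

```python
def accepts(word):
    """Deterministic PDA for { a^n b^n : n > 0 }, accepting by final state."""
    stack = ["Z"]          # Z is the stack start symbol
    state = "q0"           # q0: reading a's, q1: reading b's, q2: final
    for ch in word:
        top = stack[-1]
        if state == "q0" and ch == "a":
            stack.append("A")              # push one A per a
        elif state in ("q0", "q1") and ch == "b" and top == "A":
            stack.pop()                    # match one b against one A
            state = "q1"
        else:
            return False                   # no move defined: reject
    # epsilon move: in q1 with only Z left, go to the final state q2
    if state == "q1" and stack == ["Z"]:
        state = "q2"
    return state == "q2"


print(accepts("aabb"), accepts("aab"), accepts(""))   # True False False
```

In every situation at most one move applies, and the only ε move (in q1 with Z on top) never competes with a move on a real input symbol, so both conditions for determinism are satisfied.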

13. Describe a Universal Turing Machine and its operations. What types of languages are accepted by 
Universal TM? 

A universal Turing machine (UTM) is a Turing machine that can simulate an arbitrary Turing machine on 
arbitrary input. The universal machine essentially achieves this by reading both the description of the machine 
to be simulated as well as the input thereof from its own tape. 

The operation of a Turing machine proceeds as follows: 

1. The Turing machine reads the tape symbol that is under the Turing machine’s tape head. This symbol is 
referred to as the current symbol. 








2. The Turing machine uses its transition function to map the current state and current symbol to the 
following: the next state, the next symbol and the movement for the tape head. If the transition function 
is not defined for the current state and current symbol, then the Turing machine crashes. 

3. The Turing machine changes its state to the next state, which was returned by the transition function. 

4. The Turing machine overwrites the current symbol on the tape with the next symbol, which was returned 
by the transition function. 

5. The Turing machine moves its tape head one symbol to the left or to the right, or does not move the tape
head, depending on the value of the 'movement' that is returned by the transition function.

6. If the Turing machine's state is a halt state, then the Turing machine halts. Otherwise, repeat from step 1.

The language accepted by a Universal TM is recursively enumerable, i.e. it belongs to the class of Type 0 languages.
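The six steps can be written as a loop that takes the machine to be simulated as data, which is the idea behind a universal machine. This is a simplified sketch: the description is passed as a Python dictionary rather than encoded on a tape, and the function name, the step limit and the bit-flipping example machine are all assumptions made for the illustration.

```python
def simulate(delta, start, halt_states, tape_input, blank="B", max_steps=10_000):
    """Run the machine described by `delta` on `tape_input`.

    Returns (final_state, tape_contents); raises RuntimeError on a crash.
    """
    tape = dict(enumerate(tape_input))
    state, head = start, 0
    for _ in range(max_steps):
        if state in halt_states:                    # step 6: halt state reached
            break
        symbol = tape.get(head, blank)              # step 1: read current symbol
        if (state, symbol) not in delta:            # step 2: undefined -> crash
            raise RuntimeError(f"no transition for {(state, symbol)}")
        next_state, next_symbol, move = delta[(state, symbol)]
        state = next_state                          # step 3: change state
        tape[head] = next_symbol                    # step 4: overwrite symbol
        head += {"L": -1, "R": 1, "S": 0}[move]     # step 5: move the head
    cells = [tape[i] for i in sorted(tape)]
    return state, "".join(cells).strip(blank)


# Hypothetical machine to simulate: flip every bit, then halt on the blank.
flip = {
    ("q", "0"): ("q", "1", "R"),
    ("q", "1"): ("q", "0", "R"),
    ("q", "B"): ("halt", "B", "S"),
}
print(simulate(flip, "q", {"halt"}, "0110"))   # ('halt', '1001')
```

The same simulate function runs any machine handed to it, just as a universal TM runs any machine whose description is written on its tape.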


14. Explain about the Chomsky Hierarchy of the language. 

Noam Chomsky defined four classes of grammars, which define four classes of languages. These are arranged in a
hierarchy: each class includes the one below it. The hierarchy is strict, meaning that for each of Types 0, 1 and 2
there exist languages of that type that do not belong to the next higher-numbered type.

• Type 0: recursively enumerable languages (unrestricted grammars) 

• Type 1: context-sensitive languages (context-sensitive grammars) 

• Type 2: context-free languages (context-free grammars) 

• Type 3: regular languages (right-linear and left-linear grammars) 

QUESTION ANSWERS 

Construct DFA of the following problems

All strings that contain exactly 4 0s

[Figure: DFA state diagram]

All strings ending in 1101

[Figure: DFA state diagram]

All strings whose binary interpretation is divisible by 5

[Figure: DFA state diagram]

All strings that contain the substring 0101

[Figure: DFA state diagram]

All strings that don't contain the substring 110

[Figure: DFA state diagram]

Construct NFA of the following problems

All strings containing exactly 4 0s or an even number of 1s

[Figure: NFA state diagram]

All strings such that the third symbol from the right end is a 0

[Figure: NFA state diagram]

All strings such that some two zeros are separated by a string whose length is 4i for some i >= 0

[Figure: NFA state diagram]

All strings that contain an even number of 0s or exactly two 1s

[Figure: NFA state diagram]
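Two of the exercises above can be expressed as short simulations. These are sketches of standard constructions and are not necessarily the machines drawn in the original diagrams: the DFA uses one state per remainder modulo 5 (with the empty string treated as the number 0, an assumption), and the NFA is simulated by tracking the set of states it could be in.

```python
def divisible_by_5(bits):
    """Simulate a 5-state DFA whose state is the value read so far modulo 5."""
    state = 0                                   # start state: remainder 0
    for b in bits:
        state = (2 * state + int(b)) % 5        # delta(r, b) = (2r + b) mod 5
    return state == 0                           # accepting state: remainder 0


def third_from_right_is_zero(bits):
    """Simulate a 4-state NFA by tracking the set of possible states."""
    # State 0 loops on every symbol and guesses, on a 0, that this is the
    # third symbol from the end; states 1 and 2 count the two symbols after it.
    current = {0}
    for b in bits:
        nxt = set()
        for s in current:
            if s == 0:
                nxt.add(0)
                if b == "0":
                    nxt.add(1)
            elif s in (1, 2):
                nxt.add(s + 1)
            # state 3 is accepting and has no outgoing moves
        current = nxt
    return 3 in current


print(divisible_by_5("1010"), divisible_by_5("111"))                        # True False
print(third_from_right_is_zero("1011"), third_from_right_is_zero("1111"))  # True False
```

The NFA needs only 4 states because it can guess where the third-from-last symbol is; an equivalent DFA has to remember the last three symbols and therefore needs 8 states.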



Consider the Finite Automaton below.

[Figure: finite automaton]

Construct the smallest Deterministic Finite Automaton that accepts the same language. Finally, write a regular
expression that represents the language accepted by your machine and give a Regular Grammar that generates it.

Solution

a. Converting to a DFA

[Figure: equivalent DFA]

b. Minimizing DFA

[Figure: minimized DFA]

c. Converting to a regular expression

0 + (00 + 1 + 11) 00*

d. Converting to a regular grammar using the NFA

A -> 0 | 0B | 1C | 1D
B -> 0D
C -> 1D
D -> 0 | 0D

Using the DFA:

S -> 0A | 1B
A -> 0C | ε
B -> 0D | 1C
C -> 0D
D -> 0D | ε

Using the regular expression:

S -> 0 | 110A | 000A | 10A
A -> 0 | 0A | ε
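A quick way to gain confidence in such conversions is to compare the regular expression with the grammar on all short strings. The sketch below reads the "Using the DFA" grammar as an automaton (X -> aY is an edge, X -> ε marks X as accepting). Note that the rule D -> 0D | ε is an assumption here: it is what the regular expression requires for the trailing 00*.

```python
import re
from itertools import product

# The regular expression 0 + (00 + 1 + 11) 00* in Python syntax.
PATTERN = re.compile(r"0|(00|1|11)00*")

# Right-linear grammar read as a DFA. The entries for D are an assumption.
EDGES = {
    ("S", "0"): "A", ("S", "1"): "B",
    ("A", "0"): "C",
    ("B", "0"): "D", ("B", "1"): "C",
    ("C", "0"): "D",
    ("D", "0"): "D",
}
ACCEPTING = {"A", "D"}   # variables with an epsilon production


def grammar_accepts(word):
    """Follow the edges from S; a missing edge means the word is rejected."""
    state = "S"
    for ch in word:
        state = EDGES.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


def agree(max_len):
    """True if regex and grammar agree on every binary string up to max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            word = "".join(letters)
            if bool(PATTERN.fullmatch(word)) != grammar_accepts(word):
                return False
    return True


print(agree(8))   # True
```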

Give CFGs for the following languages

a. The language {w | w starts and ends with the same symbol}

S -> 0A0 | 1A1 | 0 | 1
A -> 0A | 1A | ε

b. The language {w | the length of w is odd}

S -> 0A | 1A
A -> 0S | 1S | ε

c. The language {w | the length of w is odd and its middle symbol is a zero}

S -> 0 | 0S0 | 0S1 | 1S0 | 1S1

d. {0^n 1^n | n > 0} ∪ {0^n 1^2n | n > 0}

S -> 0A1 | 0B11
A -> 0A1 | ε
B -> 0B11 | ε

e. {0^i 1^j 2^k | i ≠ j or j ≠ k}

S -> AC | BC | DE | DF
A -> 0 | 0A | 0A1
B -> 1 | B1 | 0B1
C -> 2 | 2C
D -> 0 | 0D
E -> 1 | 1E | 1E2
F -> 2 | F2 | 1F2

f. Binary strings with twice as many ones as zeros

S -> ε | 0S1S1S | 1S0S1S | 1S1S0S


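Explain why the grammar below, which generates strings with an equal number of 0's and 1's, is ambiguous.

S -> 0A | 1B
A -> 0AA | 1S | 1
B -> 1BB | 0S | 0

The grammar is ambiguous because we can find strings that have multiple derivations with different parse
trees, for example 001101:

Derivation 1      Derivation 2
S                 S
0A                0A
00AA              00AA
001S1             0011S
0011B1            00110A
001101            001101

In the first derivation the left A is expanded by A -> 1S and the right A by A -> 1; in the second the choices
are swapped, so the two parse trees differ.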
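The ambiguity can also be confirmed by counting parse trees. The sketch below is specific to grammars like this one, in which every production starts with a terminal (an assumption the recursion relies on to terminate); the function name is made up for the illustration.

```python
from functools import lru_cache

GRAMMAR = {"S": ("0A", "1B"), "A": ("0AA", "1S", "1"), "B": ("1BB", "0S", "0")}


def count_parse_trees(word, start="S"):
    """Number of distinct parse trees of `word`.

    Relies on every production here starting with a terminal, which makes
    each recursive call work on a strictly shorter substring.
    """
    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Number of ways the symbol string `symbols` derives word[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        if head not in GRAMMAR:                      # terminal symbol
            return seq(rest, i + 1, j) if i < j and word[i] == head else 0
        total = 0
        for k in range(i + 1, j - len(rest) + 1):    # head derives word[i:k]
            ways = sum(seq(body, i, k) for body in GRAMMAR[head])
            if ways:
                total += ways * seq(rest, k, j)
        return total

    return seq(start, 0, len(word))


print(count_parse_trees("001101"))   # 2
print(count_parse_trees("01"))       # 1
```

A count greater than 1 for any single string is enough to show that the grammar is ambiguous.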

001101 


Put the following grammar into Chomsky Normal Form. Show all work.
S -> A | AB0 | A1A
A -> A0 | ε
B -> B1 | BC
C -> CB | CA | 1B


Remove all ε rules
S -> ε | A | AB0 | A1A | B0 | A1 | 1A | 1
A -> A0 | 0
B -> B1 | BC
C -> CB | CA | 1B

Remove unit rules

S -> ε | A0 | 0 | AB0 | A1A | B0 | A1 | 1A | 1
A -> A0 | 0
B -> B1 | BC
C -> CB | CA | 1B


Convert remaining rules into proper form
S -> ε | A0 | 0 | A S_1 | B0 | A1 | 1A | 1
A -> A0 | 0
B -> B1 | BC
C -> CB | CA | 1B
S_1 -> B0 | 1A

S -> ε | A N_0 | 0 | A S_1 | B N_0 | A N_1 | N_1 A | 1
A -> A N_0 | 0
B -> B N_1 | BC
C -> CB | CA | N_1 B
S_1 -> B N_0 | N_1 A
N_0 -> 0
N_1 -> 1
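Whether a grammar is in Chomsky Normal Form is a purely syntactic property, so the result of the last step can be checked mechanically. The sketch below assumes each body is stored as a list of symbols and that a symbol is a variable exactly when it has its own entry in the dictionary. The encoding of the final grammar includes S -> 0 (from the unit rule S -> A) and S -> 1 (from S -> A1A with both A's erased).

```python
def is_cnf(grammar, start):
    """Check that every production has the form A -> BC, A -> a, or start -> ε.

    `grammar` maps a variable name to a list of bodies; each body is a list of
    symbols, and a symbol is a variable exactly when it is a key of `grammar`.
    """
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 0:                       # ε only for the start symbol
                if head != start:
                    return False
            elif len(body) == 1:                     # must be a single terminal
                if body[0] in grammar:
                    return False
            elif len(body) == 2:                     # must be two variables
                if not all(sym in grammar for sym in body):
                    return False
            else:
                return False
    return True


final = {
    "S": [[], ["A", "N0"], ["0"], ["A", "S1"], ["B", "N0"],
          ["A", "N1"], ["N1", "A"], ["1"]],
    "A": [["A", "N0"], ["0"]],
    "B": [["B", "N1"], ["B", "C"]],
    "C": [["C", "B"], ["C", "A"], ["N1", "B"]],
    "S1": [["B", "N0"], ["N1", "A"]],
    "N0": [["0"]],
    "N1": [["1"]],
}
original = {
    "S": [["A"], ["A", "B", "0"], ["A", "1", "A"]],
    "A": [["A", "0"], []],
    "B": [["B", "1"], ["B", "C"]],
    "C": [["C", "B"], ["C", "A"], ["1", "B"]],
}
print(is_cnf(final, "S"), is_cnf(original, "S"))   # True False
```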

Convert the following grammar to an equivalent one with no unit productions and no useless symbols. Show that
the original grammar had no useless symbols. What useless symbols are there after getting rid of unit
productions?

S -> A | CB
A -> C | D
B -> 1B | 1
C -> 0C | 0
D -> 2D | 2

In the original grammar every variable derives a terminal string and is reachable from S (S ⇒ A, A ⇒ C, A ⇒ D,
S ⇒ CB), so there are no useless symbols.

Converts to

S -> 0C | 0 | 2D | 2 | CB
A -> 0C | 0 | 2D | 2
B -> 1B | 1
C -> 0C | 0
D -> 2D | 2

A is now unreachable from S, so it is useless and can be removed.
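The conversion can be reproduced with two small routines, one that eliminates unit productions and one that finds the variables reachable from the start symbol. This is a sketch assuming single-character variables and bodies stored as strings; the function names are made up for the illustration.

```python
def remove_unit_productions(grammar):
    """Replace every unit production A -> B by the non-unit bodies of B."""
    result = {}
    for head in grammar:
        # Variables reachable from `head` through unit productions only.
        unit_closure, stack = {head}, [head]
        while stack:
            for body in grammar[stack.pop()]:
                if body in grammar and body not in unit_closure:
                    unit_closure.add(body)
                    stack.append(body)
        bodies = []
        for var in sorted(unit_closure):
            for body in grammar[var]:
                if body not in grammar and body not in bodies:
                    bodies.append(body)     # keep non-unit bodies only
        result[head] = bodies
    return result


def reachable(grammar, start):
    """Variables that occur in some sentential form derived from `start`."""
    seen, stack = {start}, [start]
    while stack:
        for body in grammar[stack.pop()]:
            for sym in body:
                if sym in grammar and sym not in seen:
                    seen.add(sym)
                    stack.append(sym)
    return seen


G = {"S": ["A", "CB"], "A": ["C", "D"], "B": ["1B", "1"],
     "C": ["0C", "0"], "D": ["2D", "2"]}
H = remove_unit_productions(G)
print(sorted(reachable(G, "S")))   # ['A', 'B', 'C', 'D', 'S']
print(sorted(reachable(H, "S")))   # ['B', 'C', 'D', 'S']
```

Before the conversion all five variables are reachable; afterwards A no longer appears in any body reachable from S, which is why it becomes useless.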

Consider the following NFA over the alphabet {0,1}:

[Figure: NFA state diagram]

• Convert this NFA to a minimal DFA.

• Write a regular expression for the set the machine accepts.

• Write a linear grammar where each right side is of the form aB or a ("a" a terminal and "B" a non-terminal)
to generate the set.

Solution

a. [Figure: minimal DFA]

b. [0+(0+1)(1+00)*01]*(0+1)(1+00)*

c.

A -> 0A | 0B | 1B
B -> 1B | 0C | ε
C -> 0B | 1A


MODEL QUESTIONS 


SET-1 

1. Define DFA and its language. Design a DFA that accepts all strings over Σ = {a, b} in which the numbers of
a's and b's are both odd. Show the acceptance of the string abaa using its extended transition function.

2. State and prove Pumping Lemma for regular language. Show that L = {a^m b^m | m ≥ 1} is not a regular
language.

3. What is regular expression? Give the regular expression for the following languages. 

a) RE over {0, 1} whose first and fifth symbol is 1. 

b) RE of strings over {0, 1} in which 0 appears 3 times of any. 

4. Find the minimum state DFA equivalent to the following DFA.

States    0    1
→ A       B    F
B         G    C
* C       A    C
D         C    G
E         H    F
F         C    G
G         G    E
H         G    C


5. Define Regular Grammar. Construct equivalent FA accepting the language generated by the following 
grammar. 

S → aabB | aaC | aba
A → abA | aA | bB | ε
B → aB | baA
C → aC | abb

6. Show that any CFL without ε can be generated by a grammar in CNF. Convert the following CFG into CNF.

S → ASB | ε

A → aAS | a
B → SbS | A | bb

7. Define CFG and PDA. What is the relationship between them? Convert the following CFG into PDA.

E → I | E*E | E-E | E/E | (E)

I → a | b | Ia | Ib | I0 | I1

8. What is Turing Machine? How does it differ from PDA? Describe the different variations of Turing Machine. 

9. Describe Universal Turing Machine and its operation. 

10. Write short notes (any three) 

a) Epsilon-closure of a state.

b) CNF-SAT problem 

c) Church-Turing Thesis 

d) Parse-Tree 

SET-2 

1. What is the difference between DFA and NFA? Explain with their formal definitions and examples. Design a
DFA that accepts the set {abc, abd, aacd} over Σ = {a, b, c, d}.

2. Define epsilon-closure of a state of an ε-NFA. Convert the following ε-NFA into an equivalent DFA.

State    ε         0      1
→ A      {B, D}    {A}    ∅
B        ∅         {C}    {E}
C        ∅         ∅      {B}
D        ∅         {E}    {D}
* E      ∅         ∅      ∅


3. Show that for any regular expression r, there is an ε-NFA that accepts the same language represented by r.
Convert the following regular expression into an ε-NFA.

1(1 + 10)* + 10(0 + 10)*0

4. Find the minimum state DFA equivalent to the following DFA.

Q       0    1
→ A     B    A
B       A    C
C       D    B
* D     D    A
E       D    F
F       G    E
G       F    G
H       G    D


5. Write a CFG that generates the following set over {a, b}: {a^i b^(i+1) | i ≥ 1}. Convert the grammar into Chomsky's
Normal Form.

6. Define parse tree of a string generated by a grammar. What is its nature if the grammar is in CNF? Show that
if the longest path of the parse tree with yield w from its root to a leaf is n, then |w| ≤ 2^(n-1).

7. Define deterministic PDA. Design a PDA to accept the language L = {0^n 1^n | n ≥ 1}. You may accept either by
empty stack or by final state.

8. What is Universal Turing Machine and its Language. Explain its operation. 

9. Consider the Turing Machine M,

M = ({q0, q1, q2, f}, {0, 1}, {0, 1, B}, δ, q0, B, {f})
whose transitions are defined below.

δ(q0, 0) = {(q1, 1, R)}    δ(q1, 1) = {(q2, 0, L)}

δ(q2, 1) = {(q0, 1, R)}    δ(q1, B) = {(f, B, R)}

a) Provide the execution trace of this machine on the input 011.

b) Describe the language accepted by M. 

c) Encode the above Turing Machine. 

10. Write short note (any two): 

a. Universal Turing Machine 

b. Regular Grammar 

c. PDA vs CFG 

d. Unrestricted Grammar 

SET 3 

Attempt all questions 

Group ‘A’ [8*4 = 32] 

1. Prove the following by induction: 2^x > x^2 if x > 4.

2. Write the advantages/disadvantages of NFA over DFA. 

3. Design an NFA for the language L = all strings over {0, 1} that have at least two consecutive 0's or 1's.

4. Write about the application of the Pumping Lemma.

5. Describe the closure properties of Regular Languages.

6. Describe in brief about ambiguity in grammars and languages. 

7. Write in short about Chomsky Hierarchy. 

8. Define P and NP class. Describe any two problems that lie in P and NP class. 


Group ‘B’ [6*8 = 48] 

9. Find the regular expression representing the following sets: 

(i) The set of all strings over {0, 1} having at most one pair of 0's or at most one pair of 1's.

(ii) The set of all strings over {a, b} in which there are at least two occurrence of b between any two 
occurrences of a. 

10. Write the CFG for the language L = {a^(2n) b^m | n > 0, m ≥ 0}

11. Consider the NFA given by following diagram 



12. Design a PDA for the grammar G = (V_n, V_t, P, S)
where

V_n = {S}

V_t = {a, b, c}

and P is defined as

S → aSa
S → bSb
S → c

13. Design a Turing Machine which accepts the language L = {w ∈ {a, b}* | w has an equal number of a's and b's}.

Or 

Prove that PCP with {(01, 011), (1, 10), (1, 11)} has no solution.

14. Write short notes (any two ) 

(i) Church Thesis 

(ii) Halting Problem 

(iii) Universal Turing Machine 

(iv) Parse Tree 


SET -4 


Group-A [8*4=32]

1. Define the finite Automata. Describe its application in the field of Computer science.

2. Define ε-closure of a state.

3. Differentiate between Moore and Mealy Machine. 

4. Define regular operators and regular language with example. 

5. What do you mean by ambiguity in grammar and languages? Describe with suitable example. 

6. Define a FA which accepts the set of strings containing four 1's in every string over the alphabet Σ = {0, 1}.

7. What is Regular expression? Give regular expression for the following languages. 

a) Strings over {0, 1} begin with 00 and end with 00. 

b) Strings over {0, 1} that begin or end with 00 or 11. 

8. Describe Recursive and recursively enumerable language. 

Group-B [6*8=48]

9. State and prove Pumping Lemma for regular language. Show that L = {a^m b^m | m ≥ 1} is not a regular language.

Or,

State and prove Pumping Lemma for CFL. By using the CFL Pumping Lemma, show that the language
{0^n 1^n | n ≥ 1} is not context free.

10. Define CNF. Change the following grammar into CNF.

S → abSb | a | aAb

A → bS | aAAb

11. Give the formal definition of Push Down Automata. Convert the given grammar into PDA. Also show the
complete sequence of IDs of the PDA to accept the string a+a.

S → S+T | T
T → T*F | F
F → (S) | a

12. Define Turing Machine. Write down its application. Also build a Turing machine that accepts the language of
all words that contain the substring bbb.

Or,

Define Turing Machine with its block diagram. Also build a Turing machine that accepts the language ODD
PALINDROME.

13. Define P, NP and NP-complete problems. Explain Cook's theorem.

14. What is the Chomsky Hierarchy of languages? Differentiate between type-1 and type-2 languages.

Technovative is a newly established organization whose aim is to be "dedicated to customer
satisfaction." The organization was established by a team of veteran IT experts in the fields of
programming, system development, web programming, networking, database and other
computer related offshoots. The company seeks to use the human power produced in Nepal in
its own country with the aim of promoting the standard of living of Nepalese people.


ALSO AVAILABLE

• Programming books (C, C++, JAVA, etc) 

• Computer fundamentals for school level 

• Professional course books 

• E-Books on all Academic courses 

• Preparatory Kits of all subjects 

Contact Us 

• http://www.facebook.com/Technovatives 

• http://www.technovative.org 

• technovative.anithub@gmail.com 

• 9849709357; 9841293896; 9849182179; 9841225751;9803740363 

